An isolated nucleic acid molecule is provided which encodes a mammalian signal mediator protein involved in regulation of cellular
binding motifs, and a carboxy-terminal effector domain that can induce pseudohyphal budding in yeast. The invention also provides
the novel signal mediator protein, and antibodies thereto. These biological molecules are useful as research tools and as diagnostic and
therapeutic agents for the identification, detection and regulation of complex signaling events leading to morphological, potentially neoplastic,

NUCLEIC ACID ENCODING A SIGNAL MEDIATOR PROTEIN
THAT INDUCES CELLULAR MORPHOLOGICAL ALTERATIONS



Pursuant to 35 U.S.C. §202(c), it is hereby
acknowledged that the U.S. Government has certain
rights in the invention described herein, which was
made in part with funds from the National Institutes of
Health.

FIELD OF THE INVENTION 

This invention relates to diagnosis and 
treatment of neoplastic diseases. More specifically, 
this invention provides novel nucleic acid molecules,
proteins and antibodies useful for detection and/or
regulation of complex signalling events leading to
morphological and potentially neoplastic cellular
changes.

BACKGROUND OF THE INVENTION 

Cellular transformation during the 
development of cancer involves multiple alterations in 
the normal pattern of cell growth regulation. Primary
events in the process of carcinogenesis involve the
activation of oncogene function by some means (e.g.,
amplification, mutation, chromosomal rearrangement),
and in many cases the removal of anti-oncogene
function. In the most malignant and untreatable
tumors, normal restraints on cell growth are completely
lost as transformed cells escape from their primary
sites and metastasize to other locations in the body.
One reason for the enhanced growth and invasive
properties of some tumors may be the acquisition of
increasing numbers of mutations in oncogenes, with
cumulative effect (Bear et al., Proc. Natl. Acad. Sci.
USA 86:7495-7499, 1989). Alternatively, insofar as
oncogenes function through the normal cellular
signalling pathways required for organismal growth and
cellular function (reviewed in McCormick, Nature
363:15-16, 1993), additional events corresponding to 
mutations or deregulation in the oncogenic signalling 
pathways may also contribute to tumor malignancy (Gilks 
et al., Mol. Cell Biol. 13:1759-1768, 1993), even 
though mutations in the signalling pathways alone may 
not cause cancer. 

Several discrete classes of proteins are 
known to be involved in conferring the different types 
of changes in cell division properties and morphology 
associated with transformation. These changes can be 
summarized as, first, the promotion of continuous cell 
cycling (immortalization); second, the loss of 
responsiveness to growth inhibitory signals and cell 
apoptotic signals; and third, the morphological 
restructuring of cells to enhance invasive properties. 

Of these varied mechanisms of oncogene 
action, the role of control of cell morphology is one 
of the least understood. Work using non-transformed
mammalian cells in culture has demonstrated that simply
altering the shape of a cell can profoundly alter its
pattern of response to growth signals (DiPersio et al.,
Mol. Cell Biol. 11:4405-4414, 1991), implying that
control of cell shape may actually be causative of,
rather than correlative to, cell transformation. For
example, mutation of the antioncogene NF2 leads to 
development of nervous system tumors. Higher 
eucaryotic proteins involved in promoting aberrant 
morphological changes related to cancer may mediate 
additional functions in normal cells that are not 
obviously related to the role they play in cancer 
progression, complicating their identification and 
characterization. Identification and characterization 
of such genes and their encoded proteins would be 
beneficial for the development of therapeutic 
strategies in the treatment of malignancies. 

Recent evidence suggests that certain key 
proteins involved in control of cellular morphology
contain conserved domains referred to as SH2 and SH3
domains. These domains consist of non-catalytic
stretches of approximately 50 amino acids (SH3) and 100
amino acids (SH2, also referred to as the "Src homology
domain"). SH2/SH3 domains are found in cytoskeletal
components, such as actin, and are also found in
signalling proteins such as Abl. The interaction of
these proteins may play a critical role in organizing
cytoskeleton-membrane attachments.

Besides the numerous SH2/SH3 containing 
molecules with known catalytic or functional domains, 
there are several signalling molecules, called "adapter 
proteins," which are so small that no conserved domains
seem to exist except SH2 and SH3 domains. Oncoproteins
such as Nck, Grb2/Ash/SEM5 and Crk are representatives
of this family. The SH2 regions of these oncoproteins
bind specific phosphotyrosine-containing proteins by
recognizing a phosphotyrosine in the context of several
adjacent amino acids. Following recognition and
binding, specific signals are transduced in a
phosphorylation-dependent manner.

As another example, p47v-Crk (Crk) is a
transforming gene from avian sarcoma virus isolate
CT10. This protein contains one SH2 and one SH3
domain, and induces an elevation of tyrosine
phosphorylation on a variety of downstream targets.
One of these targets, p130cas, is tightly associated
with v-Crk. The SH2 domain of v-Crk is required for
this association and subsequent cellular
transformation. p130cas is also a substrate for
Src-mediated phosphorylation. Judging from its structure,
p130cas may function as a "signal assembler" of Src
family kinases and several cellular SH2-containing
proteins. These proteins bind to the SH2 binding
domain of p130cas, which is believed to induce a
conformational change leading to the activation or
inactivation of downstream signals, modulated by
multiple domains of the protein.

Another oncogene, Ras, is a member of a large
evolutionarily conserved superfamily of small GTP-binding
proteins responsible for coordinating specific
growth factor signals with specific changes in cell
shape, including the development of stress fibers and
membrane ruffles (Ridley and Hall, Cell 70:389-399,
1992; Ridley et al., Cell 70:401-410, 1992). A rapidly
growing family of oncoproteins, including Vav, Bcr,
Ect-2, and Dbl, has been found to be involved in a
[...] ([...] Aaronson, Nature
316:273-275, 1985; Ron et al., EMBO J. 7:2465-2473,
[...]11-618, 1992; Miki et
al., Nature 362:462-465, 1993). Proteins of this
family have been shown to interact with Ras/Rac/Rho
family members, and possess sequence characteristics
that suggest they too directly associate with and
modulate organization of the cytoskeleton.

[In view of the] significant relationship
between signalling or "adapter" proteins, altered
cellular morphology and the development of cancer, it
would [be beneficial to] isolate such
proteins (or genes encoding them) for the purpose of
developing diagnostic/therapeutic agents for the
treatment of cancer. It is an object of the present
invention to provide a purified nucleic acid molecule
of mammalian origin that encodes a signal mediator
protein (SMP) involved in the signalling cascade
related to morphological cellular changes, and
therefrom provide isolated and purified protein. Such
a gene, when expressed in model systems, such as yeast,
will provide utility as a research tool for identifying
genes encoding interacting proteins in the signalling
cascade, thereby facilitating the elucidation of the
mechanistic action of other genes involved in
regulating cellular morphology and cell division. The
gene may also be used diagnostically to identify
related genes, and therapeutically in gene augmentation 
or replacement treatments. It is a further object of 
the present invention to provide derivatives of the 
SMP-encoding nucleic acid, such as various
oligonucleotides and nucleic acid fragments for use as
probes or reagents to analyze the expression of genes 
encoding the proteins. It is a further object of the 
invention to provide the signal mediator protein in 
purified form, and to provide antibodies
immunologically specific for the signal mediator
protein for the purpose of identifying and quantitating 
this mediator in selected cells and tissues. 

SUMMARY OF THE INVENTION

This invention provides novel biological 
molecules useful for identification, detection and/or 
regulation of complex signalling events that regulate 
cellular morphological changes. According to one
aspect of the present invention, an isolated nucleic
acid molecule is provided that includes an open reading
frame encoding a mammalian signal mediator protein of a
size between about 795 and about 875 amino acids in
length (preferably about 834 amino acids). The protein
comprises an amino-terminal SH3 domain, an internal
domain that includes a multiplicity of SH2 binding
motifs, and a carboxy-terminal effector domain. When
produced in Saccharomyces cerevisiae, the carboxy-terminal
effector domain is capable of inducing
pseudohyphal budding in the organism under pre-determined
culture conditions. In a preferred
embodiment, an isolated nucleic acid molecule is
provided that includes an open reading frame encoding a
human signal mediator protein. In a
particularly preferred embodiment, the human signal
mediator protein has an amino acid sequence
substantially the same as Sequence I.D. No. 2. An
exemplary nucleic acid molecule of the invention
comprises Sequence I.D. No. 1.

According to another aspect of the present
invention, an isolated nucleic acid molecule is
provided, which has a sequence selected from the group
consisting of: (1) Sequence I.D. No. 1; (2) a sequence
hybridizing with part or all of the complementary
strand of Sequence I.D. No. 1 and encoding a
polypeptide substantially the same as part or all of a
polypeptide encoded by Sequence I.D. No. 1; and (3) a
sequence encoding part or all of a polypeptide having
amino acid Sequence I.D. No. 2.

According to another aspect of the present
invention, an isolated nucleic acid molecule is
provided which has a sequence that encodes a carboxy-terminal
effector domain of a mammalian signal mediator
protein. This domain has an amino acid sequence of
greater than ... similarity to the portion of Sequence
I.D. No. 2 comprising amino acids 626-834.

According to another aspect of the present
invention, an isolated mammalian signal mediator
protein is provided which has a deduced molecular
weight of between about 100 kDa and 115 kDa (preferably
about 108 kDa). The protein comprises an amino-terminal
SH3 domain, an internal domain that includes a
multiplicity of SH2 binding motifs, and a carboxy-terminal
effector domain, which is capable of inducing
pseudohyphal budding in Saccharomyces cerevisiae under
pre-determined culture conditions, as described in
greater detail hereinbelow. In a preferred embodiment
of the invention, the protein is of human origin, and
has an amino acid sequence substantially the same as
Sequence I.D. No. 2.

According to another aspect of the present
invention, an isolated mammalian signal mediator
protein is provided, which comprises a carboxy-terminal
effector domain having an amino acid sequence of
greater than 74% similarity to the portion of Sequence
I.D. No. 2 comprising amino acids 626-834. In a
preferred embodiment, the amino acid sequence of the
carboxy-terminal effector domain is greater than about
50% identical to that portion of Sequence I.D. No. 2.

According to another aspect of the present 
invention, antibodies immunologically specific for the 
proteins described hereinabove are provided. 

Various terms relating to the biological
molecules of the present invention are used hereinabove
and also throughout the specifications and claims. The
terms "substantially the same," "percent similarity"
and "percent identity (identical)" are defined in
detail in the description set forth below.

With reference to nucleic acids of the
invention, the term "isolated nucleic acid" is
sometimes used. This term, when applied to DNA, refers
to a DNA molecule that is separated from sequences with
which it is immediately contiguous (in the 5' and 3'
directions) in the naturally occurring genome of the
organism from which it was derived. For example, the
"isolated nucleic acid" may comprise a DNA molecule
inserted into a vector, such as a plasmid or virus
vector, or integrated into the genomic DNA of a
procaryote or eucaryote.

With respect to RNA molecules of the 
invention, the term "isolated nucleic acid" primarily 
refers to an RNA molecule encoded by an isolated DNA 
molecule as defined above. Alternatively, the term may
refer to an RNA molecule that has been sufficiently
separated from RNA molecules with which it would be
associated in its natural state (i.e., in cells or
tissues), such that it exists in a "substantially pure"
form (the term "substantially pure" is defined below).

With respect to protein, the term "isolated
protein" or "isolated and purified protein" is
sometimes used herein. This term refers primarily to a
protein produced by expression of an isolated nucleic
acid molecule of the invention. Alternatively, this
term may refer to a protein which has been sufficiently
separated from other proteins with which it would
naturally be associated, so as to exist in
"substantially pure" form. 

The term "substantially pure" refers to a 
preparation comprising at least 50-60% by weight the 
compound of interest (e.g., nucleic acid,
oligonucleotide, protein, etc.). More preferably, the
preparation comprises at least 75% by weight, and most
preferably 90-99% by weight, the compound of interest.
Purity is measured by methods appropriate for the
compound of interest (e.g., chromatographic methods,
agarose or polyacrylamide gel electrophoresis, HPLC
analysis, and the like).

With respect to antibodies of the invention, 
the term "immunologically specific" refers to 
antibodies that bind to one or more epitopes of a
protein of interest (e.g., SMP), but which do not
substantially recognize and bind other molecules in a
sample containing a mixed population of antigenic
biological molecules.

With respect to oligonucleotides, the term
"specifically hybridizing" refers to the association
between two single-stranded nucleotide molecules of
sufficiently complementary sequence to permit such
hybridization under pre-determined conditions generally
used in the art (sometimes termed "substantially
complementary"). In particular, the term refers to
hybridization of an oligonucleotide with a
substantially complementary sequence contained within a
single-stranded DNA or RNA molecule of the invention,
to the substantial exclusion of hybridization of the
oligonucleotide with single-stranded nucleic acids of
non-complementary sequence.
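
The notion of a "substantially complementary" sequence can be illustrated computationally. The following Python sketch is offered only as an illustration and is not part of the described methods: the function names and the 0.9 cutoff are assumptions introduced here, and base-pair counting of this kind is no substitute for hybridization under the pre-determined conditions referred to above.

```python
# Illustrative sketch only. The 0.9 threshold and all names are assumptions;
# the specification defines specificity by hybridization conditions, not by
# a numerical cutoff.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def best_complementarity(oligo: str, target: str) -> float:
    """Highest fraction of oligo bases able to pair with any same-length,
    ungapped window of the single-stranded target."""
    # The oligo pairs antiparallel with the stretch of target whose sequence
    # equals the oligo's reverse complement.
    probe = reverse_complement(oligo)
    n = len(probe)
    if n == 0 or n > len(target):
        return 0.0
    target = target.upper()
    best = 0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(1 for a, b in zip(probe, window) if a == b)
        best = max(best, matches)
    return best / n


def specifically_hybridizes(oligo: str, target: str, threshold: float = 0.9) -> bool:
    """Crude stand-in for 'specifically hybridizing' (assumed cutoff)."""
    return best_complementarity(oligo, target) >= threshold
```

Under these assumptions, an oligonucleotide that is the exact reverse complement of a stretch of the target scores 1.0, while an unrelated oligonucleotide scores well below the cutoff.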

WO 97/02362    PCT/US96/10823

The nucleic acids, proteins and antibodies of
the present invention are useful as research tools and
will facilitate the elucidation of the mechanistic
action of the novel genetic and protein interactions
involved in the control of cellular morphology. They
should also find broad utility as diagnostic and
therapeutic agents for the detection and treatment of
cancer and other proliferative diseases.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURES 1A-1D. Alignment of nucleotide
sequence (Sequence I.D. No. 1) and deduced amino acid
sequence (Sequence I.D. No. 2) of HEF1, a cDNA of human
origin encoding an exemplary signal mediator protein of
the invention.

FIGURE 2. Amino acid sequence alignment of
the deduced amino acid sequence of HEF1 (Sequence I.D.
No. 2) with homologous sequences of p130cas from rat
(Sequence I.D. No. 3). Boxes represent regions of
sequence identity between the two proteins. The closed
circle marks the site of the initial methionine in the
truncated clone of HEF1. The thick underline denotes
the conserved SH3 domain. Tyrosines are marked with
asterisks.

FIGURE 3. Amino acid sequence alignment of
the carboxy-terminal regions of HEF1-encoded hSMP with
p130cas and the mouse homolog of hSMP, mSMP encoded by
MEF1 (Sequence I.D. No. 4).

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the present invention, a
novel gene has been isolated that encodes a protein
involved in the signal transduction pathway that
coordinates changes in cellular growth regulation.
This protein is sometimes referred to herein as "signal
mediator protein" or "SMP."

Using a screen to identify human genes that
promote pseudohyphal conversion in the yeast
Saccharomyces cerevisiae, a 900 bp partial cDNA clone
was obtained that causes strong pseudohyphal growth of
[...]. The shift from normal to "pseudohyphal" budding in yeast
has been shown to involve the action of growth
regulatory kinase cascades and cell cycle-related
transcription factors (Gimeno and Fink, Mol. Cell Biol.
[...]; [...], Mol. Cell Biol. 13:5567-5581,
1993; Liu et al., Science 262:1741-1744, 1993).
[...] as a probe [...] approaches, a full-length
clone of approximately 3.7 kb was isolated. This
[...] reading frame of
about 834 amino acids, which constitutes the signal
mediator protein of the invention. SMP is
[characterized by] an amino-terminal SH3 domain and an
adjacent domain containing multiple SH2 binding motifs.
The protein also contains a carboxy-terminal "effector"
domain that is capable of inducing the shift to pseudohyphal
budding in yeast. A cDNA encoding a mouse
homolog of the carboxy-terminal "effector" region has
also been identified (Figure 3). Homology searches of
the GenBank data base revealed an approximately 64%
similarity on the amino acid level between SMP from
human and the adapter protein, p130cas, recently cloned
from rat (as disclosed by Sakai et al., EMBO J.
13:3748-3756, 1994). However, p130cas is significantly
larger than SMP (968 amino acids for rat p130cas versus
834 amino acids for human SMP), and differs with respect
to amino acid composition. A comparison of SMP with
p130cas is set forth in greater detail in Example 1.
The aforementioned human partial cDNA clone
[that enhanced pseudohyphal formation] in yeast encodes
only the carboxy-terminal portion of SMP, comprising
about 132 amino acids. The enhancement of pseudohyphal
formation by the carboxy-terminal fragment of SMP, in
addition to the relatively high degree of homology with
p130cas over this region, indicates that it is this
domain that acts as an effector in regulating cellular
morphology. Thus, this domain is sometimes referred to
herein as a "C-terminal effector domain." It should be
noted that, although the carboxy-terminal fragment of
p130cas was also found capable of enhancing
pseudohyphal formation, it did not do so to the same
extent as the C-terminal domain of SMP (on a scale of 1
to 10, the SMP C-terminal domain is a "10," while the
p130cas C-terminal domain is a "6"). The SMP C-terminal
domain was also found to be involved in
homodimerization and in heterodimerization with p130cas
and, like p130cas, associates with Abl and appears to
be phosphorylated by Abl.

Thus, SMP can be classified within a family
of docking adapters, which includes p130cas, capable of
multiple associations with signalling molecules and
transduction of such signals to coordinate changes in
cellular growth regulation. The SMP protein comprises,
from amino- to carboxy-terminus, an SH3 domain, a polyproline
domain, several SH2 binding motifs, a serine-rich
region, and the carboxy-terminal effector domain.

A human clone that encodes an exemplary
signal mediator protein of the invention is sometimes
referred to herein as "HEF1" (human enhancer of
filamentation) to reflect the screening method by which
it was in part identified. The nucleotide sequence of
HEF1 is set forth herein as Sequence I.D. No. 1. The
signal mediator protein encoded by HEF1 is sometimes
referred to herein as hSMP. The amino acid sequence
deduced from Sequence I.D. No. 1 is set forth herein as
Sequence I.D. No. 2. The characteristics of human SMP
are described in greater detail in Example 1.

It is believed that Sequence I.D. No. 1
constitutes a full-length SMP-encoding clone as it
contains a suitable methionine for initiation of
translation. This cDNA is approximately 3.7 kb in
length. Northern analysis of a human multi-tissue RNA
blot (Clontech MTNI) suggests a full-length transcript
of approximately 3.4 kb. A second transcript of
approximately 5.4 kb was also observed, which may
represent an alternative splice or initiation site.

Although the human SMP-encoding gene, HEF1,
is described and exemplified herein, this invention is
intended to encompass nucleic acid sequences and
proteins from other species that are sufficiently
similar to be used interchangeably with SMP-encoding
nucleic acids and proteins for the research, diagnostic
and therapeutic purposes described below. Because of
the high degree of conservation of genes encoding
specific signal transducers and related oncogenes, it
will be appreciated by those skilled in the art that,
even if the interspecies SMP homology is low, SMP-encoding
nucleic acids and SMP proteins from a variety
of mammalian species should possess a sufficient degree
of homology with SMP so as to be interchangeably useful
with SMP in such diagnostic and therapeutic
applications. Accordingly, the present invention is
drawn to mammalian SMP-encoding nucleic acids and SMP
proteins, preferably to SMP of primate origin, and most
preferably to SMP of human origin. Accordingly, when
the terms "signal mediator protein" or "SMP" or "SMP-encoding
nucleic acid" are used herein, they are
intended to encompass mammalian SMP-encoding nucleic
acids and SMPs falling within the confines of homology
set forth below, of which hSMP, preferably encoded by
HEF1, is an exemplary member.

Allelic variants and natural mutants of 
Sequence I.D. No. 1 are likely to exist within the 
human genome and within the genomes of other mammalian 
species. Because such variants are expected to possess
certain differences in nucleotide and amino acid
sequence, this invention provides an isolated nucleic
acid molecule and an isolated SMP protein having at
least about 50-60% (preferably 60-80%, most preferably
over 80%) sequence homology in the coding region with
the nucleotide sequence set forth as Sequence I.D. No.
1 (and, preferably, specifically comprising the coding
region of Sequence I.D. No. 1), and the amino acid
sequence of Sequence I.D. No. 2. Because of the
natural sequence variation likely to exist among signal
mediator proteins and nucleic acids encoding them, one
skilled in the art would expect to find up to about
40-50% sequence variation, while still maintaining the
unique properties of the SMP of the present invention.
Such an expectation is due in part to the degeneracy of
the genetic code, as well as to the known evolutionary
success of conservative amino acid sequence variations,
which do not appreciably alter the nature of the
protein. Accordingly, such variants are considered
substantially the same as one another and are included 
within the scope of the present invention. 

For purposes of this invention, the term
"substantially the same" refers to nucleic acid or
amino acid sequences having sequence variations that do
not materially affect the nature of the protein (i.e.,
the structure and/or biological activity of the
protein). With particular reference to nucleic acid
sequences, the term "substantially the same" is
intended to refer to the coding region and to conserved
sequences governing expression, and refers primarily to
degenerate codons encoding the same amino acid, or
alternate codons encoding conservative substitute amino
acids in the encoded polypeptide. With reference to
amino acid sequences, the term "substantially the same"
refers generally to conservative substitutions and/or
variations in regions of the polypeptide not involved
in determination of structure or function. The terms
"percent identity" and "percent similarity" are also
used herein in comparisons among amino acid sequences.
These terms are intended to be defined as they are in
the UWGCG sequence analysis program (Devereux et al.,
Nucl. Acids Res. 12:387-397, 1984), available from the
University of Wisconsin.
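
By way of illustration only, the distinction between "percent identity" and "percent similarity" can be sketched in Python for two sequences that have already been aligned. The grouping of conservative substitutions below is a hypothetical simplification introduced for this example; the UWGCG programs score similarity with a comparison matrix, so values from this sketch should not be expected to reproduce theirs.

```python
# Illustrative sketch only. CONSERVATIVE_GROUPS is an assumed, simplified
# grouping; it is not the scoring scheme of the UWGCG package.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
    set("AG"),    # small
]


def percent_identity_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent similarity) for two pre-aligned
    amino acid sequences of equal length; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap positions are skipped in this sketch
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Percent similarity is always at least as large as percent identity in this scheme, which is consistent with the effector-domain thresholds stated above (greater than 74% similarity, greater than about 50% identity).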

The following description sets forth the
general procedures involved in practicing the present
invention. To the extent that specific materials are
mentioned, it is merely for purposes of illustration
and is not intended to limit the invention. Unless
otherwise specified, general cloning procedures, such
as those set forth in Sambrook et al., Molecular
Cloning, Cold Spring Harbor Laboratory (1989)
(hereinafter "Sambrook et al.") are used.

I. Preparation of SMP-Encoding Nucleic Acid Molecules, 
Signal Mediator Proteins and Antibodies Thereto

A. Nucleic Acid Molecules 

Nucleic acid molecules encoding the SMPs of 
the invention may be prepared by two general methods:
(1) They may be synthesized from appropriate nucleotide
triphosphates, or (2) they may be isolated from 
biological sources. Both methods utilize protocols 
well known in the art. 

The availability of nucleotide sequence
information, such as the full length cDNA having
Sequence I.D. No. 1, enables preparation of an isolated
nucleic acid molecule of the invention by
oligonucleotide synthesis. Synthetic oligonucleotides
may be prepared by the phosphoramidite method employed
in the Applied Biosystems 380A DNA Synthesizer or
similar devices. The resultant construct may be
purified according to methods known in the art, such as
high performance liquid chromatography (HPLC). Long,
double-stranded polynucleotides, such as a DNA molecule
of the present invention, must be synthesized in
stages, due to the size limitations inherent in current
oligonucleotide synthetic methods. Thus, for example,
a 3.7 kb double-stranded molecule may be synthesized as
several smaller segments of appropriate
complementarity. Complementary segments thus produced
may be annealed such that each segment possesses
appropriate cohesive termini for attachment of an
adjacent segment. Adjacent segments may be ligated by
annealing cohesive termini in the presence of DNA
ligase to construct an entire 3.7 kb double-stranded
molecule. A synthetic DNA molecule so constructed may
then be cloned and amplified in an appropriate vector.
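
The staged synthesis described above may be pictured with a short Python sketch. This is one possible layout under stated assumptions, not a protocol from the specification: the segment length of 40 nucleotides, the 8-nucleotide stagger, and the function names are all hypothetical, and a practical design would also weigh melting temperatures and secondary structure.

```python
# Illustrative sketch only. segment_len and overhang are assumed values;
# the specification does not prescribe segment sizes.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_segments(seq: str, segment_len: int = 40, overhang: int = 8):
    """Split a duplex into staggered top- and bottom-strand oligos.

    Top oligos tile the top strand. Bottom oligos tile the bottom strand
    with junctions shifted by `overhang`, so every junction on one strand
    is bridged by an oligo on the other, leaving cohesive termini.
    """
    if not 0 < overhang < segment_len:
        raise ValueError("overhang must be between 0 and segment_len")
    seq = seq.upper()
    top = [seq[i:i + segment_len] for i in range(0, len(seq), segment_len)]
    # Junction positions on the bottom strand, in top-strand coordinates.
    cuts = [0] + list(range(overhang, len(seq), segment_len)) + [len(seq)]
    bottom = [reverse_complement(seq[a:b]) for a, b in zip(cuts, cuts[1:]) if b > a]
    return top, bottom


def reassemble(top, bottom):
    """Rebuild both strands (each reported in top-strand orientation)."""
    top_strand = "".join(top)
    bottom_as_top = "".join(reverse_complement(oligo) for oligo in bottom)
    return top_strand, bottom_as_top
```

Because the two sets of junctions never coincide, annealing the oligos yields a nicked duplex whose nicks can then be sealed by DNA ligase, in keeping with the ligation step described above.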

Nucleic acid sequences encoding SMP may be 
isolated from appropriate biological sources using 
methods known in the art. In a preferred embodiment, a 
cDNA clone is isolated from an expression library of 
human origin. In an alternative embodiment, human 
genomic clones encoding SMP may be isolated. 
Alternatively, cDNA or genomic clones encoding SMP from
other mammalian species may be obtained.

In accordance with the present invention,
nucleic acids having the appropriate level of sequence
homology with the protein coding region of Sequence
I.D. No. 1 may be identified by using hybridization and
washing conditions of appropriate stringency. For
example, hybridizations may be performed, according to
the method of Sambrook et al., using a hybridization
solution comprising: 5X SSC, 5X Denhardt's reagent,
1.0% SDS, 100 µg/ml denatured, fragmented salmon sperm
DNA, 0.05% sodium pyrophosphate and up to 50%
formamide. Hybridization is carried out at 37-42°C for
at least six hours. Following hybridization, filters
are washed as follows: (1) 5 minutes at room
temperature in 2X SSC and 1% SDS; (2) 15 minutes at
room temperature in 2X SSC and 0.1% SDS; (3) 30
minutes-1 hour at 37°C in 1X SSC and 1% SDS; (4) 2
hours at 42-65°C in 1X SSC and 1% SDS, changing the
solution every 30 minutes.
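
As a worked example of preparing the hybridization solution listed above, the following Python sketch applies the usual dilution relation (stock volume = final concentration / stock concentration x total volume). The final concentrations are taken from the text, with formamide set at its stated upper limit of 50%; the stock concentrations are assumptions chosen for illustration and should be replaced by those actually on hand.

```python
# Illustrative sketch only. Final concentrations follow the text; every stock
# concentration below is an assumption made for this example.
RECIPE = {
    # component: (final concentration, assumed stock concentration)
    "SSC": (5.0, 20.0),                   # X
    "Denhardt's reagent": (5.0, 50.0),    # X
    "SDS": (1.0, 10.0),                   # percent
    "salmon sperm DNA": (0.1, 10.0),      # mg/ml (100 ug/ml final)
    "sodium pyrophosphate": (0.05, 5.0),  # percent
    "formamide": (50.0, 100.0),           # percent (upper limit in the text)
}


def component_volumes(total_ml: float, recipe=RECIPE) -> dict:
    """Return the volume (ml) of each stock, plus water, for total_ml."""
    volumes = {
        name: total_ml * final / stock
        for name, (final, stock) in recipe.items()
    }
    used = sum(volumes.values())
    if used > total_ml:
        raise ValueError("stocks are too dilute to reach the final volume")
    volumes["water"] = total_ml - used
    return volumes
```

With these assumed stocks, 100 ml of solution would take 25 ml SSC, 10 ml Denhardt's reagent, 10 ml SDS, 1 ml salmon sperm DNA, 1 ml sodium pyrophosphate and 50 ml formamide, leaving about 3 ml of water.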

Nucleic acids of the present invention may be
maintained as DNA in any convenient cloning vector. In
a preferred embodiment, clones are maintained in a
plasmid cloning/expression vector, such as pBluescript
(Stratagene, La Jolla, CA), which is propagated in a
suitable E. coli host cell.

SMP-encoding nucleic acid molecules of the 
invention include cDNA, genomic DNA, RNA, and fragments 
thereof which may be single- or double-stranded. Thus, 
this invention provides oligonucleotides (sense or
antisense strands of DNA or RNA) having sequences
capable of hybridizing with at least one sequence of a
nucleic acid molecule of the present invention, such as
selected segments of the cDNA having Sequence I.D. No.
1. Such oligonucleotides are useful as probes for
detecting SMP genes in test samples of potentially
malignant cells or tissues, e.g., by PCR amplification,
or for the isolation of homologous regulators of
morphological control.

B. Proteins

A full-length SMP of the present invention 
may be prepared in a variety of ways, according to 
known methods. The protein may be purified from 
appropriate sources, e.g., human or animal cultured
cells or tissues, by immunoaffinity purification.
However, this is not a preferred method due to the low
amount of protein likely to be present in a given cell 
type at any time. 

The availability of nucleic acid molecules
encoding SMP enables production of the protein using in
vitro expression methods known in the art. For
example, a cDNA or gene may be cloned into an
appropriate in vitro transcription vector, such as pSP64
or pSP65 for in vitro transcription, followed by cell-free
translation in a suitable cell-free translation
system, such as wheat germ or rabbit reticulocytes. In
vitro transcription and translation systems are
commercially available, e.g., from Promega Biotech,
Madison, Wisconsin or BRL, Rockville, Maryland.

Alternatively, according to a preferred
embodiment, larger quantities of SMP may be produced by
expression in a suitable procaryotic or eucaryotic
system. For example, part or all of a DNA molecule,
such as the cDNA having Sequence I.D. No. 1, may be
inserted into a plasmid vector adapted for expression
in a bacterial cell, such as E. coli, or into a
baculovirus vector for expression in an insect cell.
Such vectors comprise the regulatory elements necessary
for expression of the DNA in the bacterial host cell,
positioned in such a manner as to permit expression of
the DNA in the host cell. Such regulatory elements
required for expression include promoter sequences,
transcription initiation sequences and, optionally,
enhancer sequences.

The SMP produced by gene expression in a
recombinant procaryotic or eucaryotic system may be
purified according to methods known in the art. In a
preferred embodiment, a commercially available
expression/secretion system can be used, whereby the
recombinant protein is expressed and thereafter
secreted from the host cell, to be easily purified from
the surrounding medium. If expression/secretion
vectors are not used, an alternative approach involves
purifying the recombinant protein by affinity
separation, such as by immunological interaction with
antibodies that bind specifically to the recombinant
protein. Such methods are commonly used by skilled
practitioners.

The signal mediator proteins of the
invention, prepared by the aforementioned methods, may
be analyzed according to standard procedures. For
example, such proteins may be subjected to amino acid
sequence analysis, according to known methods.

The present invention also provides
antibodies capable of immunospecifically binding to
proteins of the invention. Polyclonal antibodies
directed toward SMP may be prepared according to
standard methods. In a preferred embodiment,
monoclonal antibodies are prepared, which react
immunospecifically with various epitopes of SMP.
Monoclonal antibodies may be prepared according to
general methods of Kohler and Milstein, following
standard protocols. Polyclonal or monoclonal
antibodies that immunospecifically interact with SMP
can be utilized for identifying and purifying such
proteins. For example, antibodies may be utilized for
affinity separation of proteins with which they
immunospecifically interact. Antibodies may also be
used to immunoprecipitate proteins from a sample
containing a mixture of proteins and other biological
molecules. Other uses of anti-SMP antibodies are
described below.

II. Uses of SMP-Encoding Nucleic Acids, Signal
Mediator Proteins and Antibodies Thereto

Cellular signalling molecules have received a
great deal of attention as potential prognostic
indicators of neoplastic disease and as therapeutic
agents to be used for a variety of purposes in cancer
chemotherapy. As a signalling molecule that induces
profound morphological changes, SMP and related
proteins from other mammalian species promise to be
particularly useful research tools, as well as
diagnostic and therapeutic agents.

A. SMP-Encoding Nucleic Acids

SMP-encoding nucleic acids may be used for a
variety of purposes in accordance with the present
invention. SMP-encoding DNA, RNA, or fragments thereof
may be used as probes to detect the presence of and/or
expression of genes encoding SMP. Methods in which
SMP-encoding nucleic acids may be utilized as probes
for such assays include, but are not limited to: (1) in
situ hybridization; (2) Southern hybridization; (3)
northern hybridization; and (4) assorted amplification
reactions such as polymerase chain reactions (PCR).

The SMP-encoding nucleic acids of the
invention may also be utilized as probes to identify
related genes either from humans or from other species.
As is well known in the art, hybridization stringencies
may be adjusted to allow hybridization of nucleic acid
probes with complementary sequences of varying degrees
of homology. Thus, SMP-encoding nucleic acids may be
used to advantage to identify and characterize other
genes of varying degrees of relation to SMP, thereby
enabling further characterization of the signalling
cascade involved in the morphological control of
different cell types. Additionally, they may be used
to identify genes encoding proteins that interact with
SMP (e.g., by the "interaction trap" technique), which
should further accelerate elucidation of these cellular
signalling mechanisms.

Nucleic acid molecules, or fragments thereof,
encoding SMP may also be utilized to control the
expression of SMP, thereby regulating the amount of
protein available to participate in oncogenic
signalling pathways. Alterations in the physiological
amount of "adapter protein" may act synergistically
with chemotherapeutic agents used to treat cancer. In
one embodiment, the nucleic acid molecules of the
invention may be used to decrease expression of SMP in
a population of malignant cells. In this embodiment,
SMP proteins would be unable to serve as substrate
acceptors for phosphorylation events mediated by
oncogenes, thereby effectively abrogating the activation
signal. In this embodiment, antisense oligonucleotides
are employed which are targeted to specific regions of
SMP-encoding genes that are critical for gene
expression. The use of antisense oligonucleotides to
decrease expression levels of a pre-determined gene is
known in the art. In a preferred embodiment, such
antisense oligonucleotides are modified in various ways
to increase their stability and membrane permeability,
so as to maximize their effective delivery to target
cells in vitro and in vivo. Such modifications include
the preparation of phosphorothioate or
methylphosphonate derivatives, among many others,
according to procedures known in the art.
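
By way of illustration only, the following sketch shows how the base sequence of an antisense oligonucleotide may be derived from a sense-strand target region by reverse complementation. The choice of an 18-base target spanning the first ATG, and the names `reverse_complement` and `antisense_oligo`, are assumptions made for this example and are not taken from the specification; the 30 bases used are the first 30 bases of the Sequence I.D. No. 1 listing.

```python
# Minimal sketch: derive an antisense oligonucleotide by reverse complementation.
# The target window (6 bases upstream of the first ATG, 18 bases long) is an
# assumption chosen for illustration, not a region identified in the text.

# Watson-Crick pairing for DNA bases
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int) -> str:
    """Return the oligo complementary to sense[start:start + length]."""
    if start < 0:
        raise ValueError("target window starts before the sequence")
    target = sense[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


# First 30 bases of the Sequence I.D. No. 1 listing (sense strand).
SEQ1_5PRIME = "ACCCCCACGCTACCGAAATGAAGTATAAGA"

atg = SEQ1_5PRIME.find("ATG")            # position of the first ATG
oligo = antisense_oligo(SEQ1_5PRIME, atg - 6, 18)
print(oligo)                             # written 5' to 3'
```

Backbone modifications such as phosphorothioate linkages do not alter the base sequence, so the same string would describe the modified oligonucleotide.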

In another embodiment, overexpression of SMP
is induced in a target population of cells to generate
an excess of signal adapter molecules. This excess
allows SMP to serve as a phosphorylation "sink" for the
kinase activity of transforming oncogenes.
Overexpression of SMP could lead to alterations in the
cytoskeleton which could then be monitored with
immunofluorescence or any other standard technique.
Alternatively, overexpression of SMP
by this method may facilitate the isolation and
characterization of other components involved in the
protein-protein complex formation that occurs via the
SH2 homology domains during signal transduction.

As described above, SMP-encoding nucleic
acids are also used to advantage to produce large
quantities of substantially pure SMP protein, or
selected portions thereof. In a preferred embodiment,
the C-terminal "effector domain" of SMP is produced by
[...]
full-length protein or selected domain is thereafter
used for various research, diagnostic and therapeutic
purposes, as described below.



WO 97/02362    PCT/US96/10823

B. Signal Mediator Protein and Antibodies

Purified SMP, or fragments thereof, may be
used to produce polyclonal or monoclonal antibodies
which also may serve as sensitive detection reagents
for the presence and accumulation of SMP (or complexes
containing SMP) in cultured cells or tissues from
living patients (the term "patients" refers to both
humans and animals). Recombinant techniques enable
expression of fusion proteins containing part or all of
the SMP protein. The full length protein or fragments
of the protein may be used to advantage to generate an
array of monoclonal antibodies specific for various
epitopes of the protein, thereby providing even greater
sensitivity for detection of the protein in cells or
tissue.

Polyclonal or monoclonal antibodies
immunologically specific for SMP may be used in a
variety of assays designed to detect and quantitate the
protein, which may be useful for rendering a prognosis
as to a malignant disease. Such assays include, but
are not limited to: (1) flow cytometric analysis; (2)
immunochemical localization of SMP in cultured cells or
tissues; and (3) immunoblot analysis (e.g., dot blot,
Western blot) of extracts from various cells and
tissues. Additionally, as described above, anti-SMP
antibodies can be used for purification of SMP (e.g.,
affinity column purification, immunoprecipitation).

Anti-SMP antibodies may also be utilized as
therapeutic agents to block the normal functionality of
SMP in a target cell population, such as a tumor.
Thus, similar to the antisense oligonucleotides
described above, anti-SMP antibodies may be delivered
to a target cell population by methods known in the art
(i.e. through various lipophilic carriers that enable
delivery of the compound of interest to the target cell
cytoplasm) where the antibodies may interact with
intrinsic SMP to render it nonfunctional.

From the foregoing discussion, it can be seen
that SMP-encoding nucleic acids and SMP proteins of the
invention can be used to detect SMP gene expression and
protein accumulation for purposes of assessing the
genetic and protein interactions involved in the
regulation of morphological control pathways of a cell
or tissue sample. Aberrant morphological changes are
often correlatable with metastatic cellular
proliferation in various cancers, such as breast
cancer. It is expected that these tools will be
particularly useful for diagnosis and prognosis of
human neoplastic disease. Potentially of greater
significance, however, is the utility of SMP-encoding
nucleic acids, proteins and antibodies as therapeutic
agents to disrupt the signal transduction pathways
mediated by activated oncogenes that result in aberrant
morphological cellular alterations.

Although the compositions of the invention
have been described with respect to human diagnostics
and therapeutics, it will be apparent to one skilled in
the art that these tools will also be useful in animal
and cultured cell experimentation with respect to
various malignancies and/or other conditions manifested
by alterations in cellular morphology. As diagnostic
agents they can be used to monitor the effectiveness of
potential anti-cancer agents on signal transduction
pathways mediated by oncogenic proteins in vitro,
and/or the development of neoplasms or malignant
diseases in animal model systems. As therapeutics,
they can be used either alone or as adjuncts to other
chemotherapeutic drugs in animal models and veterinary
applications to improve the effectiveness of such
anti-cancer agents.

The following Example is provided to describe
the invention in further detail. This Example is
intended to illustrate and not to limit the invention.

EXAMPLE 1

Isolation and Characterization of a
Nucleic Acid Molecule Encoding Human SMP

In this Example, we describe the cloning of a
cDNA molecule encoding human SMP. This cDNA is
sometimes referred to herein as HEF1 for human enhancer
of filamentation, because of its identification in the
pseudohyphal screen. We also provide an analysis of
the structure of the human SMP (hSMP) as predicted from
the deduced amino acid sequence encoded by the cDNA.
Additionally, we describe the antibodies immunospecific
for the recombinant hSMP protein, and their use in
immunological detection of phosphorylated SMP from
normal and Abl transformed NIH3T3 cells.

Isolation of cDNA and cloning

A HeLa cDNA library constructed in the
TRP1+ vector JG4-4 (Gyuris et al., Cell 75:791-803, 1993),
with inserts expressed as native proteins
under the control of the galactose-inducible GAL1
promoter, was transformed into CGx74 yeast (MATa/α trp1/trp1; see
Gimeno et al., 1992, supra). TRP+ transformants were
plated to the nitrogen-restricted SLAGR medium (like
SLAD, but with 2% galactose, 1% raffinose as a carbon
source), and 120,000 colonies were visually screened
using a Wild dissecting microscope at 50x magnification
to identify colonies that produced pseudohyphae more
extensively than background. cDNAs from these colonies
were isolated and retransformed to naive CGx74; those
that reproducibly generated enhanced pseudohyphae were
sequenced. A 900 bp cDNA encoding a 182 amino acid
open reading frame corresponding to the COOH-terminus
of hSMP (HEF1-Cterm182) possessed the most dramatic
phenotype of the cDNAs obtained in this screen. Using the
original 900 bp cDNA isolated in the pseudohyphal
screen to probe a placental cDNA library cloned in
lambda gt11, a larger clone (3.4 kb) was isolated. The
longer clone obtained in this screen was used as a
basis for 5' RACE using a kit from Clontech containing
RACE-ready cDNA prepared from human kidney. Three
independent clones from the RACE approach yielded
identical 5' end-points located 18 base pairs upstream
of the ATG encoding the first methionine in the
sequence shown in Figure 1. Repeated efforts with
multiple primer sets showed no evidence for an
N-terminally extended sequence. The full length clone,
HEF1, is about 3.7 kb and encodes a protein about 835
amino acids in length.

Sequence Analysis

Both strands of the HEF1 clone were sequenced
using oligonucleotide primers to the JG4-4 vector and
to internal HEF1 sequences in combination with the
Sequenase system (United States Biochemical). Database
searching was performed using the BLAST algorithm
(Altschul et al., J. Mol. Biol. 215:403-410, 1990) and
sequence analysis was carried out using the package of
programs from UWGCG (Devereux et al., Nucl. Acids Res.
12:387-397, 1984).

Northern Analysis

HEF1 cDNA was labelled with 32P-dCTP by random
priming, and used to probe a Northern blot containing 2
µg/lane human mRNA from multiple tissues. The blot was
stripped and reprobed with a 32P-labelled
oligonucleotide specific for actin as a control for
equivalent loading.

Immunoprecipitation and Western Blotting

Immunoprecipitation of hSMP from normal and
Abl transformed NIH 3T3 cells was accomplished using
polyclonal antiserum raised against a peptide derived
from the hSMP C-terminus. Immunoprecipitates were
resolved by electrophoresis on a 12% SDS-polyacrylamide
gel. Following electrophoresis, immunoprecipitates were
transferred to nitrocellulose, and reprobed with
anti-phosphotyrosine antibody (4G10).

Growth Profiles

Yeast were transformed with HEF1 or vector
alone and grown to saturated overnight cultures in trp-
glucose defined minimal medium, and re-diluted to OD600
<0.05 in trp- galactose for growth curves. Growth
curves were performed, with readings taken at 90 minute
intervals for 12 hours, and at less frequent intervals
up to 48 hours or longer.
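
As a hedged illustration of how growth curves of this kind might be compared, the sketch below estimates a doubling time from OD600 readings taken during exponential growth by a least-squares fit of ln(OD600) against time. The function name `doubling_time` and the readings shown are hypothetical; no growth data are given in this specification.

```python
import math


def doubling_time(times_h, od600):
    """Estimate doubling time (hours) from exponential-phase OD600 readings.

    Fits ln(OD600) = a + k * t by ordinary least squares and returns ln(2) / k.
    """
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired readings")
    logs = [math.log(v) for v in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = sxy / sxx                       # specific growth rate, per hour
    return math.log(2) / k


# Hypothetical readings at 90 minute intervals, starting near OD600 = 0.05.
times = [0.0, 1.5, 3.0, 4.5, 6.0]
vector_od = [0.05 * 2 ** (t / 3.0) for t in times]   # synthetic, 3 h doubling
print(round(doubling_time(times, vector_od), 2))
```

Similar doubling times for a test culture and a vector-only control would be consistent with the expressed cDNA not being simply toxic; only readings from the exponential phase should be included in the fit.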

Interaction Trap or Two Hybrid Analysis

EGY48 yeast (Gyuris et al., 1993, supra) were
transformed by standard methods with plasmids
expressing LexA-fusions, activation-domain fusions, or
both, together with the LexA operator-LacZ reporter
SH18-34 (Gyuris et al., 1993, supra). For all fusion
proteins, synthesis of a fusion protein of the correct
length in yeast was confirmed by Western blot assays of
yeast extracts (Samson et al., Cell 57:1045-1052,
1989) using polyclonal antiserum specific for LexA
(Brent and Ptashne, Nature 312:612-615, 1984) or for
hemagglutinin (Babco, Inc.), as appropriate. Activation
of the LacZ reporter was determined as previously
described (Brent and Ptashne, Cell 43:729-736, 1985).
Beta-galactosidase assays were performed on three
independent colonies, on three separate occasions, and
values for particular plasmid combinations varied less
than 25%. Activation of the LEU2 reporter was
determined by observing the colony forming ability of
yeast plated on complete minimal medium lacking
leucine. The LexA-PRD/HD expressing plasmid has been
described (Golemis and Brent, Mol. Cell Biol. 12:3006-3014,
1992).

RESULTS

Overexpression of the C-terminal domain of
SMP influences Saccharomyces cerevisiae cell
morphology. To identify proteins that regulate the
morphology and polarity of human cells, a human cDNA
library was screened for genes which enhanced formation
of pseudohyphae when expressed in S. cerevisiae. The
yeast undergoes a dimorphic shift in response to severe
nitrogen limitation that involves changes in budding
pattern, cell cycle control, cell elongation, and
invasive growth into agar (Gimeno et al., 1992, supra).
A galactose-inducible HeLa cell cDNA library was used
to transform a yeast strain that can form pseudohyphae
on nitrogen-restricted media, and a number of human
genes which specifically enhanced pseudohyphal
formation were identified. One of the cDNAs derived
from this screen was found to cause the constitutive
formation of pseudohyphae on rich and nitrogen
restricted media. This cDNA is sometimes referred to
as "HEF1-Cterm182" (because it encodes 182 amino acids
of the C-terminal domain of the human SMP). A
full-length clone containing the cDNA sequence was
thereafter obtained. Analysis of the sequence of this
cDNA (Sequence I.D. No. 1; Figure 1) revealed that it
was a novel human gene with strong sequence similarity
to the rat p130cas gene (as disclosed by Sakai et al.,
EMBO J. 13:3748-3756, 1994). This gene was designated
HEF1, and its encoded protein was designated hSMP
(Sequence I.D. No. 2). A comparison of the amino acid
compositions (% by weight) of the HEF1-encoded hSMP and
the rat p130cas is shown in Table 1 below.

TABLE 1

Amino Acid % Composition

Amino acid        hSMP    p130cas
Alanine            4.3      6.2
Arginine           6.1      7.5
Asparagine         4.1      1.8
Aspartic acid      5.6      6.5
Cysteine           1.5      0.6
Glutamine          8.3      8.1
Glutamic acid      6.6      5.8
Glycine            3.5      4.5
Histidine          4.0      3.1
Isoleucine         4.2      1.6
Leucine            8.7      9.6
Lysine             6.2      4.8
Methionine         2.8      1.0
Phenylalanine      3.2      1.6
Proline            7.0     11.1
Serine             6.6      6.7
Threonine          4.8      4.9
Tryptophan         1.1      1.1
Tyrosine           4.8      4.7
Valine             5.6      7.7

The deduced length of HEF1-encoded hSMP is
834 amino acids and its deduced molecular weight is
about 107,897 Da. The deduced length of the rat
p130cas is 968 amino acids and its deduced molecular
weight is about 121,421 Da.
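
For readers who wish to reproduce figures of this kind from a sequence, the following is a minimal sketch of one way a deduced molecular weight and a percent-by-weight amino acid composition could be computed. It assumes standard average residue masses and assumes that "% by weight" means each residue type's share of the summed residue masses; the software actually used for Table 1 may apply a different convention. The function names and the short peptide are illustrative only.

```python
# Average residue masses (Da) for the 20 standard amino acids, i.e. the mass
# of each amino acid minus one water lost on peptide bond formation.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added back for the free N- and C-termini


def molecular_weight(seq: str) -> float:
    """Deduced molecular weight (Da) of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER


def percent_by_weight(seq: str) -> dict:
    """Share of total residue mass contributed by each amino acid type."""
    seq = seq.upper()
    total = sum(RESIDUE_MASS[aa] for aa in seq)
    mass_by_type = {}
    for aa in seq:
        mass_by_type[aa] = mass_by_type.get(aa, 0.0) + RESIDUE_MASS[aa]
    return {aa: 100.0 * m / total for aa, m in mass_by_type.items()}


# Short peptide used only for illustration.
peptide = "MKYKNLMARALYDNVPE"
print(round(molecular_weight(peptide), 1))
for aa, pct in sorted(percent_by_weight(peptide).items()):
    print(aa, round(pct, 1))
```

A deduced weight computed this way ignores post-translational modifications such as phosphorylation or glycosylation.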

Tissue specific expression of HEF1. RNA
production was assessed by Northern blot analysis.
HEF1 is expressed as two predominant transcripts of
approximately 3.4 and 5.4 kb. Although present in all
tissues examined (heart, brain, placenta, lung, liver,
skeletal muscle, kidney and pancreas), these
transcripts are present at significantly higher levels
in kidney, lung, and placenta. In contrast, a more
uniform distribution throughout the body has been
reported for p130cas. Two other cross-hybridizing
minor species were detected, migrating at 8.0 kb in
lung and 1.2 kb in liver. These may represent
alternatively spliced HEF1 transcripts or other
HEF1/p130cas related genes. HEF1 represents a distinct
gene from p130cas rather than a human homolog, inasmuch
as a screen of a murine genomic library with HEF1 cDNA
led to identification of an exon that encoded a mouse
C-terminal effector protein having a sequence
essentially identical to hSMP-Cterm182 (Figure 3).
Furthermore, probe of a zoo blot at high stringency
with a HEF1 cDNA probe indicates this gene is highly
conserved from humans to yeast.

hSMP does not induce constitutive
pseudohyphal budding by causing severe cell stress.
The possibility that the C-terminal domain of hSMP was
enhancing pseudohyphae formation by causing severe cell
stress was excluded by comparing the growth rates of
yeast containing the HEF1-Cterm182 cDNA to yeast
containing the expression vector control on plates and
in liquid culture, with galactose as a sugar source to
induce expression of HEF1-Cterm182. The growth rate
data show that SMP-encoding genes are not simply toxic
to yeast.




SMP belongs to a class of "adapter proteins"
important in signalling cascades influencing
morphological control. The HEF1 gene is approximately
3.7 kb and encodes a single continuous open reading
frame of about 835 amino acids. The predicted hSMP
protein notably contains an amino-terminal SH3 domain
and an adjacent domain containing multiple SH2 binding
motifs. Homology search of the Genbank database
revealed that hSMP is 64% similar at the amino acid
level to the adapter protein p130cas, recently cloned
from rat (Sakai et al., EMBO J. 13:3748-3756, 1994).
The amino acid alignment of hSMP and p130cas is shown
in Figure 2. p130cas was determined to be the
predominant phosphorylated species in cells following
transformation by the oncoprotein Crk and also
complexes with, and is a substrate for, Abl and Src. As
shown in Table 2 below, the homology between SMP and
p130cas is most pronounced over the SH3 domain (92%
similarity, 74% identity) and in the region
corresponding to the SMP-Cterm182 fragment (74%
similarity, 57% identity). Although the domain
containing SH2-binding motifs is more divergent from
p130cas, SMP similarly possesses a large number of
tyrosines in this region. The majority of SH2 binding
sites in p130cas match the consensus for the SH2 domain
of the oncoprotein Crk, while the amino acids flanking
the tyrosine residues in SMP are more diverse,
suggesting a broader range of associating proteins.
Various SH2 binding motifs conserved between hSMP and
p130cas are shown in Table 3.

TABLE 2

Domain Alignment: hSMP and p130cas

(Domains from amino to carboxyl terminus down the Table)

                               Size (a.a.)
Domain                        hSMP   p130cas   % Similarity/Identity (hSMP : p130cas)
SH3                             50      50     92% similar, 74% identical
Polyproline                     10      38     (not compared)
SH2 binding motifs             290     410     55% similar, 36% identical
Serine-rich region             250     260     56% similar, 35% identical
C-terminal effector domain     210     210     74% similar, 57% identical


TABLE 3

Conserved SH2 Binding Motifs and Associating Proteins

SH2 Binding Motif                 Associating Proteins
YDIP, YDVP, YDFP                  Crk
YEYP                              Vav or fps/fes
YAIP                              Abl
YQNQ                              Grb2
YQVP, YQKD, YVYE, YPSR, YNCD      Novel

The enhancement of pseudohyphal formation by
the hSMP-Cterm182 fragment, in addition to the relatively high
degree of homology to p130cas, suggests that this domain
acts as an effector in regulating cellular morphology. A
test was performed to assay whether the homologous region
of p130cas also enhanced pseudohyphal formation. The
results show that the C-terminal fragment of p130cas did
enhance pseudohyphal formation but not to the same extent
as the C-terminal fragment of SMP. SMP was found to
induce the strongest pseudohyphal phenotype of any cDNA
fragment. By comparison, p130cas and another
pseudohyphal inducer, RBP7 (subunit 7 of human RNA
polymerase II, Golemis et al., Mol. Biol. of the Cell,
1995, in press) were only about 60% as effective as the
hSMP-Cterm182 fragment.

The possible functions for the novel
carboxy-terminal domains were investigated further using
two-hybrid analysis. These experiments revealed that this
domain mediated SMP homodimerization, and SMP/p130cas
heterodimerization, yet failed to interact with
non-specific control proteins.

SMP is a substrate for oncogene mediated
phosphorylation. SMP was immunoprecipitated from normal
and v-Abl transformed NIH3T3 cells using polyclonal
antisera raised against a MAP peptide derived from the
hSMP C-terminal domain. Probe of these
immunoprecipitates with antibody to phosphotyrosine
revealed a species migrating at approximately 130-140 kD
that was specifically observed in Abl-transformed
fibroblasts. This species may represent SMP
phosphorylated by Abl, as SMP possesses a good match to
the SH2 binding domain recognized by Abl. The larger
apparent molecular weight as compared with the hSMP deduced
molecular weight may reflect glycosylation or may be a
result of its phosphorylated state.

SMP dimerizes with other important cellular
regulatory proteins. To assay whether SMP dimerizes with
other cellular proteins, the interaction trap/two hybrid
analysis system was used. Briefly, a LexA-fusion and an
epitope-tagged, activation-domain fusion to SMP were
synthesized. The expression of proteins of the predicted
size in yeast was confirmed using antibodies specific for
the fusion moieties. Using a LexA-operator reporter, it
was observed that LexA-SMP fusion protein activates
transcription extremely weakly. However, LexA-SMP is
able to interact with co-expressed activation
domain-fused SMP to activate transcription of the reporter,
indicating that it is able to form dimers (or higher
order multimers).

SMP joins p130cas in defining a new family of
docking adapters that, through multiple associations with
signalling molecules via SH2 binding domains, is likely
to coordinate changes in cellular growth regulation. The
interactions between SMP homodimers and SMP-p130cas
heterodimers may negatively regulate SMP and p130cas
proteins by making them inaccessible to their targets.
Alternatively, SMP and p130cas could work together to
recruit new proteins to the signalling complex. The fact
that the novel C-terminal domain shared between SMP and
p130cas has the ability to cause pseudohyphal formation
in yeast suggests that these proteins may directly alter
cellular morphology by interacting with the cytoskeleton.
In fact, previous yeast-morphology based screens for
higher eucaryotic proteins have tended to isolate
cytoskeletally related proteins. This invention
therefore provides reagents influencing the changes in
cell morphology that accompany oncoprotein-mediated
transformation in carcinogenesis.

The present invention is not limited to the 
embodiments specifically described above, but is capable 
of variation and modification without departure from the 
scope of the appended claims. 



SEQUENCE LISTING



(1) GENERAL INFORMATION:

     (i) APPLICANT: Golemis, Erica A.
                    Law, Susan F.
                    Estojak, JoAnne

    (ii) TITLE OF INVENTION: NUCLEIC ACID MOLECULE ENCODING A
         SIGNAL MEDIATOR PROTEIN THAT INDUCES CELLULAR
         MORPHOLOGICAL ALTERATIONS

   (iii) NUMBER OF SEQUENCES: 4

    (iv) CORRESPONDENCE ADDRESS:
         (A) ADDRESSEE: Dann, Dorfman, Herrell and Skillman
         (B) STREET: 1601 Market Street Suite 720
         (C) CITY: Philadelphia
         (D) STATE: PA
         (E) COUNTRY: USA
         (F) ZIP: 19103-2307

     (v) COMPUTER READABLE FORM:
         (A) MEDIUM TYPE: Floppy disk
         (B) COMPUTER: IBM PC compatible
         (C) OPERATING SYSTEM: PC-DOS/MS-DOS
         (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
         (A) APPLICATION NUMBER:
         (B) FILING DATE: 30-June-1995
         (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
         (A) NAME: Reed, Janet E.
         (B) REGISTRATION NUMBER: 36,252

    (ix) TELECOMMUNICATION INFORMATION:
         (A) TELEPHONE: (215) 563-4100
         (B) TELEFAX: (215) 563-4044

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
         (A) LENGTH: 3672 base pairs
         (B) TYPE: nucleic acid
         (C) STRANDEDNESS: double
         (D) TOPOLOGY: not relevant

    (ii) MOLECULE TYPE: cDNA

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACCCCCACGC TACCGAAATG AAGTATAAGA ATCTTATGGC AAGGGCCTTA TATGACAATG    60
TCCCAGAGTG TGCCGAGGAA CTGGCCTTTC GCAAGGGAGA CATCCTGACC GTCATAGAGC   120
AGAACACAGG GGGACTGGAA GGATGGTGGC TGTGCTCGTT ACACGGTCGG CAAGGCATTG   180
TCCCAGGCAA CCGGGTGAAG CTTCTGATTG GCCCCATGCA GGAGACTGCC TCCAGTCACG   240
AGCAGCCTGC CTCTGGACTG ATGCAGCAGA CCTTTGGCCA ACAGAAGCTC TATCAAGTGC   300
CAAACCCACA GGCTGCTCCC CGAGACACTA TCTACCAAGT GCCACCTTCC TACCAAAATC   360
AGGGAATTTA CCAAGTCCCC ACTGGCCACG GCACCCAAGA ACAAGAGGTA TATCAGGTGC   420
CACCATCAGT GCAGAGAAGC ATTGGGGGAA CCAGTGGGCC CCACGTGGGT AAAAAGGTGA   480
TAACCCCCGT GAGGACAGGC CATGGCTACG TATACGAGTA CCCATCCAGA TACCAAAAGG   540
ATGTCTATGA TATCCCTCCT TCTCATACCA CTCAAGGGGT ATACGACATC CCTCCCTCAT   600
CAGCAAAAGG CCCTGTGTTT TCAGTTCCAG TGGGAGAGAT AAAACCTCAA GGGGTGTATG   660
ACATCCCGCC TACAAAAGGG GTATATGCCA TTCCGCCCTC TGCTTGCCGG GATGAAGCAG   720
GGCTTAGGGA AAAAGACTAT GACTTCCCCC CTCCCATGAG ACAAGCTGGA AGGCCGGACC   780
TCAGACCGGA GGGGGTTTAT GACATTCCTC CAACCTGCAC CAAGCCAGCA GGGAAGGACC   840
TTCATGTAAA ATACAACTGT GACATTCCAG GAGCTGCAGA ACCGGTGGCT CGAAGGCACC   900
AGAGCCTGTC CCCGAATCAC CCACCCCCGC AACTCGGACA GTCAGTGGGC TCTCAGAACG   960
ACGCATATGA TGTCCCCCGA GGCGTTCAGT TTCTTGAGCC ACCAGCAGAA ACCAGTGAGA  1020
AAGCAAACCC CCAGGAAAGG GATGGTGTTT ATGATGTCCC TCTGCATAAC CCGCCAGATG  1080
CTAAAGGCTC TCGGGACTTG GTGGATGGGA TCAACCGATT GTCTTTCTCC AGTACAGGCA  1140
GCACCCGGAG TAACATGTCC ACGTCTTCCA CCTCCTCCAA GGAGTCCTCA CTGTCAGCCT  1200
CCCCAGCTCA GGACAAAAGG CTCTTCCTGG ATCCAGACAC AGCTATTGAG AGACTTCAGC  1260
GGCTCCAGCA GGCCCTTGAG ATGGGTGTCT CCAGCCTAAT GGCACTGGTC ACTACCGACT  1320
GGCGGTGTTA CGGATATATG GAAAGACACA TCAATGAAAT ACGCACAGCA GTGGACAAGG  1380
TGGAGCTGTT CCTGAAGGAG TACCTCCACT TTGTCAAGGG AGCTGTTGCA AATGCTGCCT  1440
GCCTCCCGGA ACTCATCCTC CACAACAAGA TGAAGCGGGA GCTGCAACGA GTCGAAGACT  1500
CCCACCAGAT CCTGAGTCAA ACCAGCCATG ACTTAAATGA GTGCAGCTGG TCCCTGAATA  1560
TCTTGGCCAT CAACAAGCCC CAGAACAAGT GTGACGATCT GGACCGGTTT GTGATGGTGG  1620
CAAAGACGGT GCCCGATGAC GCCAAGCAGC TCACCACAAC CATCAACACC AACGCAGAGG  1680
CCCTCTTCAG ACCCGGCCCT GGCAGCTTGC ATCTGAAGAA TGGGCCGGAG AGCATCATGA  1740
ACTCAACGGA GTACCCACAC GGTGGCTCCC AGGGACAGCT GCTGCATCCT GGTGACCACA  1800
AGGCCCAGGC CCACAACAAG GCACTGCCCC CAGGCCTGAG CAAGGAGCAG GCCCCTGACT  1860
GTAGCAGCAG TGATGGTTCT GAGAGGAGCT GGATGGATGA CTACGATTAC GTCCACCTAC  1920
AGGGTAAGGA GGAGTTTGAG AGGCAACAGA AAGAGCTATT GGAAAAAGAG AATATCATGA  1980
AACAGAACAA GATGCAGCTG GAACATCATC AGCTGAGCCA GTTCCAGCTG TTGGAACAAG  2040
AGATTACAAA GCCCGTGGAG AATGACATCT CGAAGTGGAA GCCCTCTCAG AGCCTACCCA  2100
CCACAAACAG TGGCGTGAGT GCTCAGGATC GGCAGTTGCT GTGCTTCTAC TATGACCAAT  2160
GTGAGACCCA TTTCATTTCC CTTCTCAACG CCATTGACGC ACTCTTCAGT TGTGTCAGCT  2220
CAGCCCAGCC CCCGCGAATC TTCGTGGCAC ACAGCAAGTT TGTCATCCTC AGTGCACACA  2280
AACTGGTGTT CATTGGAGAC ACGCTGACAC GGCAGGTGAC TGCCCAGGAC ATTCGCAACA  2340
AAG?CATGAA CTCCAGCAAC CAGCTCTGCG AGCAGCTCAA GACTATAGTC ATGGCAACCA  2400
AGATGGCCGC CCTCCATTAC CCCAGCACCA CGGCCCTGCA GGAAATGGTG CACCAAGTGA  2460
?AGACCTTTC TAGAAATGCC CAGCTGTTCA AGCGCTCTTT GCTGGAGATG GCAACGTTCT  2520
GAGAAGAAAA AAAAGAGGAA GGGGACTGCG TTAACGGTTA CTAAGGAAAA CTGGAAATAC  2580
?G??GG??T  TTGTAAATGT TATCTATTTT TGTAGATAAT TTTATATAAA AATGAAATAT  2640
???AAGA??T TATGGGTCAG ACAACTTTCA GAAATTCAGG GAGCTGGAGA GGGAAATCTT  2700
?????GGGCG GTGAGTNGTT CTTATGTATA CACAGAAGTA TCTGAGACAT AAACTGTACA  2760
AAAA?????? CCACGTCCTT TTGTATGCCC ATGTATTCAT GTTTTTGTTT GTAGATGTTT  2820
GTCTGATGCA TTTCATTAAA AAAAAAACCA TGAATTACGA AGCACCTTAG TAAGCACCTT  2880
CTAATGCTGC ATTTTTTTTG TTGTTGTTAA AAACATCCAG CTGGTTATAA TATTGTTCTC  2940
CACGTCCTTG TGATGATTCT GAGCCTGGCA CTGGGAATCT GGGAAGCATA GTTTATTTGC  3000
AAGTGTTCAC CTTCCAAATC ATGAGGCATA GCATGACTTA TTCTTGTTTT GAAAACTCTT  3060
TTCAAAACTG ACCATCTTAA ACACATGATG GCCAAGTGCC ACAAAGCCCT CTTGCGGAGA  3120
CATTTACGAA TATATATGTG GATCCAAGTC TCGATAGTTA GGCGTTGGAG GGAAGAGAGA  3180
CCAGAGAGTT TAGAGGCCAG GACCACAGTT AGGATTGGGT TGTTTCAATA CTGAGAGACA  3240
GCTACAATAA AAGGAGAGCA ATTGCCTCCC TGGGGCTGTT CAATCTTCTG CATTTGTGAG  3300
TGGTTCAGTC ATGAGGTTTT CCAAAAGATG TTTTTAGAGT TGTAAAAACC ATATTTGCAG  3360
CAAAGATTTA CAAAGGCGTA TCAGACTATG ATTGTTCACC AAAATAGGGG AATGGTTTGA  3420
TCCGCCAGTT GCAAGTAGAG GCCTTTCTGA CTCTTAATAT TCACTTTGGT GCTACTACCC  3480
CCATTACCTG AGGAACTGGC CAGGTCCTTG ATCATGGAAC TATAGAGCTA CCAGACATAT  3540
CCTGCTCTCT AAGGGAATTT ATTGCTATCT TGCACCTTCT TTAAAACTCA AAAAACATAT  3600
GCAGACCTGA CACTCAAGAG TGGCTAGCTA CACAGAGTCC ATCTAATTTT TGCAACTTCC  3660
CCCCCCGAAT TC                                                      3672



50 
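The row layout used in the nucleotide listing (blocks of ten bases, six blocks per row, each row closed by the position of its last base) can be reproduced with a short script. The following is a minimal sketch, not part of the original filing: the function name `format_listing` and its parameters are hypothetical, and the input is the final 72 bases of SEQ ID NO: 1 (positions 3601 to 3672) as printed in the listing.

```python
def format_listing(seq, offset=0, block=10, per_line=6):
    """Return listing rows: space-separated blocks of `block` bases,
    `per_line` blocks per row, each row ending with the absolute
    position of its last base. `offset` is the number of bases that
    precede `seq` in the full sequence (hypothetical helper)."""
    seq = "".join(seq.split()).upper()  # drop whitespace, normalise case
    width = block * per_line
    rows = []
    for start in range(0, len(seq), width):
        chunk = seq[start:start + width]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        rows.append(" ".join(blocks) + " " + str(offset + start + len(chunk)))
    return rows


# Final 72 bases of SEQ ID NO: 1 as printed in the listing.
TAIL = (
    "GCAGACCTGA CACTCAAGAG TGGCTAGCTA CACAGAGTCC ATCTAATTTT TGCAACTTCC "
    "CCCCCCGAAT TC"
)

rows = format_listing(TAIL, offset=3600)
for row in rows:
    print(row)
```

Running the same function over the whole sequence with `offset=0` is one way to check that the running position counts in a transcribed listing are self-consistent.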

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 834 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Lys Tyr Lys Asn Leu Met Ala Arg Ala Leu Tyr Asp Asn Val Pro
1 5 10 15

Glu Cys Ala Glu Glu Leu Ala Phe Arg Lys Gly Asp Ile Leu Thr Val
20 25 30

Ile Glu Gln Asn Thr Gly Gly Leu Glu Gly Trp Trp Leu Cys Ser Leu
35 40 45

His Gly Arg Gln Gly Ile Val Pro Gly Asn Arg Val Lys Leu Leu Ile
50 55 60

Gly Pro Met Gln Glu Thr Ala Ser Ser His Glu Gln Pro Ala Ser Gly
65 70 75 80

Leu Met Gln Gln Thr Phe Gly Gln Gln Lys Leu Tyr Gln Val Pro Asn
85 90 95

Pro Gln Ala Ala Pro Arg Asp Thr Ile Tyr Gln Val Pro Pro Ser Tyr
100 105 110

Gln Asn Gln Gly Ile Tyr Gln Val Pro Thr Gly His Gly Thr Gln Glu
115 120 125

Gln Glu Val Tyr Gln Val Pro Pro Ser Val Gln Arg Ser Ile Gly Gly
130 135 140

Thr Ser Gly Pro His Val Gly Lys Lys Val Ile Thr Pro Val Arg Thr
145 150 155 160

Gly His Gly Tyr Val Tyr Glu Tyr Pro Ser Arg Tyr Gln Lys Asp Val
165 170 175

Tyr Asp Ile Pro Pro Ser His Thr Thr Gln Gly Val Tyr Asp Ile Pro
180 185 190

Pro Ser Ser Ala Lys Gly Pro Val Phe Ser Val Pro Val Gly Glu Ile
195 200 205

Lys Pro Gln Gly Val Tyr Asp Ile Pro Pro Thr Lys Gly Val Tyr Ala
210 215 220

Ile Pro Pro Ser Ala Cys Arg Asp Glu Ala Gly Leu Arg Glu Lys Asp
225 230 235 240

Tyr Asp Phe Pro Pro Pro Met Arg Gln Ala Gly Arg Pro Asp Leu Arg
245 250 255

Pro Glu Gly Val Tyr Asp Ile Pro Pro Thr Cys Thr Lys Pro Ala Gly
260 265 270

Lys Asp Leu His Val Lys Tyr Asn Cys Asp Ile Pro Gly Ala Ala Glu
275 280 285

Pro Val Ala Arg Arg His Gln Ser Leu Ser Pro Asn His Pro Pro Pro
290 295 300

Gln Leu Gly Gln Ser Val Gly Ser Gln Asn Asp Ala Tyr Asp Val Pro
305 310 315 320

Arg Gly Val Gln Phe Leu Glu Pro Pro Ala Glu Thr Ser Glu Lys Ala
325 330 335

Asn Pro Gln Glu Arg Asp Gly Val Tyr Asp Val Pro Leu His Asn Pro
340 345 350


Pro Asp Ala Lys Gly Ser Arg Asp Leu Val Asp Gly Ile Asn Arg Leu
355 360 365

Ser Phe Ser Ser Thr Gly Ser Thr Arg Ser Asn Met Ser Thr Ser Ser
370 375 380

Thr Ser Ser Lys Glu Ser Ser Leu Ser Ala Ser Pro Ala Gln Asp Lys
385 390 395 400

Arg Leu Phe Leu Asp Pro Asp Thr Ala Ile Glu Arg Leu Gln Arg Leu
405 410 415

Gln Gln Ala Leu Glu Met Gly Val Ser Ser Leu Met Ala Leu Val Thr
420 425 430

Thr Asp Trp Arg Cys Tyr Gly Tyr Met Glu Arg His Ile Asn Glu Ile
435 440 445

Arg Thr Ala Val Asp Lys Val Glu Leu Phe Leu Lys Glu Tyr Leu His
450 455 460

Phe Val Lys Gly Ala Val Ala Asn Ala Ala Cys Leu Pro Glu Leu Ile
465 470 475 480

Leu His Asn Lys Met Lys Arg Glu Leu Gln Arg Val Glu Asp Ser His
485 490 495

Gln Ile Leu Ser Gln Thr Ser His Asp Leu Asn Glu Cys Ser Trp Ser
500 505 510

Leu Asn Ile Leu Ala Ile Asn Lys Pro Gln Asn Lys Cys Asp Asp Leu
515 520 525

Asp Arg Phe Val Met Val Ala Lys Thr Val Pro Asp Asp Ala Lys Gln
530 535 540

Leu Thr Thr Thr Ile Asn Thr Asn Ala Glu Ala Leu Phe Arg Pro Gly
545 550 555 560

Pro Gly Ser Leu His Leu Lys Asn Gly Pro Glu Ser Ile Met Asn Ser
565 570 575

Thr Glu Tyr Pro His Gly Gly Ser Gln Gly Gln Leu Leu His Pro Gly
580 585 590

Asp His Lys Ala Gln Ala His Asn Lys Ala Leu Pro Pro Gly Leu Ser
595 600 605

Lys Glu Gln Ala Pro Asp Cys Ser Ser Ser Asp Gly Ser Glu Arg Ser
610 615 620

Trp Met Asp Asp Tyr Asp Tyr Val His Leu Gln Gly Lys Glu Glu Phe
625 630 635 640

Glu Arg Gln Gln Lys Glu Leu Leu Glu Lys Glu Asn Ile Met Lys Gln
645 650 655

Asn Lys Met Gln Leu Glu His His Gln Leu Ser Gln Phe Gln Leu Leu
660 665 670

Glu Gln Glu Ile Thr Lys Pro Val Glu Asn Asp Ile Ser Lys Trp Lys
675 680 685

Pro Ser Gln Ser Leu Pro Thr Thr Asn Ser Gly Val Ser Ala Gln Asp
690 695 700

Arg Gln Leu Leu Cys Phe Tyr Tyr Asp Gln Cys Glu Thr His Phe Ile
705 710 715 720

Ser Leu Leu Asn Ala Ile Asp Ala Leu Phe Ser Cys Val Ser Ser Ala
725 730 735

Gln Pro Pro Arg Ile Phe Val Ala His Ser Lys Phe Val Ile Leu Ser
740 745 750

Ala His Lys Leu Val Phe Ile Gly Asp Thr Leu Thr Arg Gln Val Thr
755 760 765

Ala Gln Asp Ile Arg Asn Lys Val Met Asn Ser Ser Asn Gln Leu Cys
770 775 780

Glu Gln Leu Lys Thr Ile Val Met Ala Thr Lys Met Ala Ala Leu His
785 790 795 800

Tyr Pro Ser Thr Thr Ala Leu Gln Glu Met Val His Gln Val Thr Asp
805 810 815

Leu Ser Arg Asn Ala Gln Leu Phe Lys Arg Ser Leu Leu Glu Met Ala
820 825 830

Thr Phe



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 872 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Lys Tyr Leu Asn Val Leu Ala Lys Ala Leu Tyr Asp Asn Val Ala
1 5 10 15

Glu Ser Pro Asp Glu Leu Ser Phe Arg Lys Gly Asp Ile Met Thr Val
20 25 30

Glu Arg Asp Thr Gln Gly Leu Asp Gly Trp Trp Leu Cys Ser Leu His
35 40 45

Gly Arg Gln Gly Ile Val Pro Gly Asn Arg Leu Lys Ile Leu Val Gly
50 55 60

Met Tyr Asp Lys Lys Pro Ala Ala Pro Gly Pro Gly Pro Pro Ala Thr
65 70 75 80

Pro Pro Gln Pro Gln Pro Ser Leu Pro Gln Gly Val His Thr Pro Val
85 90 95

Pro Pro Ala Ser Gln Tyr Ser Pro Met Leu Pro Thr Ala Tyr Gln Pro
100 105 110

Gln Pro Asp Asn Val Tyr Leu Val Pro Thr Pro Ser Lys Thr Gln Gln
115 120 125

Gly Leu Tyr Gln Ala Pro Gly Asn Pro Gln Phe Gln Ser Pro Pro Ala
130 135 140

Lys Gln Thr Ser Thr Phe Ser Lys Gln Thr Pro His His Ser Phe Pro
145 150 155 160

Ser Pro Ala Thr Asp Leu Tyr Gln Val Pro Pro Gly Pro Gly Ser Pro
165 170 175

Ala Gln Asp Ile Tyr Gln Val Pro Pro Ser Ala Gly Thr Gly His Asp
180 185 190

Ile Tyr Gln Val Pro Pro Ser Leu Asp Thr Arg Ser Trp Glu Gly Thr
195 200 205

Lys Pro Pro Ala Lys Val Val Val Pro Thr Arg Val Gly Gln Gly Tyr
210 215 220

Val Tyr Glu Ala Ser Gln Ala Glu Gln Asp Glu Tyr Asp Thr Pro Arg
225 230 235 240

His Leu Leu Ala Pro Gly Ser Gln Asp Ile Tyr Asp Val Pro Pro Val
245 250 255

Arg Gly Leu Leu Pro Asn Gln Tyr Gly Gln Glu Val Tyr Asp Thr Pro
260 265 270

Pro Met Ala Val Lys Gly Pro Asn Gly Arg Asp Pro Leu Leu Asp Val
275 280 285

Tyr Asp Val Pro Pro Ser Val Glu Lys Gly Leu Pro Pro Ser Asn His
290 295 300

His Ser Val Tyr Asp Val Pro Pro Ser Val Ser Lys Asp Val Pro Asp
305 310 315 320

Gly Pro Leu Leu Arg Glu Glu Thr Tyr Asp Val Pro Pro Ala Phe Ala
325 330 335

Lys Pro Lys Pro Phe Asp Pro Thr Arg His Pro Leu Ile Leu Ala Ala
340 345 350

Pro Pro Pro Asp Ser Pro Pro Ala Glu Asp Val Tyr Asp Val Pro Pro
355 360 365

Pro Ala Pro Asp Leu Tyr Asp Val Pro Pro Gly Leu Arg Arg Pro Gly
370 375 380

Pro Gly Thr Leu Tyr Asp Val Pro Arg Glu Arg Val Leu Pro Pro Glu
385 390 395 400

Val Ala Asp Gly Ser Val Ile Asp Asp Gly Val Tyr Ala Val Pro Pro
405 410 415

Pro Ala Glu Arg Glu Ala Pro Thr Asp Gly Lys Arg Leu Ser Ala Ser
420 425 430

Ser Thr Gly Ser Thr Arg Ser Ser Gln Ser Ala Ser Ser Leu Glu Val
435 440 445

Val Val Pro Gly Arg Glu Pro Leu Glu Leu Glu Val Ala Val Glu Thr
450 455 460
[Residues 465 to 816 of SEQ ID NO: 3 are not legible in the source.]

Tyr Ser Asn Leu Leu Cys Asp Leu Leu Arg Gly Ile Val Ala Thr Thr
820 825 830

Lys Ala Ala Ala Leu Gln Tyr Pro Ser Pro Ser Ala Ala Gln Asp Met
835 840 845

Val Asp Arg Val Lys Glu Leu Gly His Ser Thr Gln Gln Phe Arg Arg
850 855 860

Val Leu Gly Gln Leu Ala Ala Ala
865 870


(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 78 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: C-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Ser Gln Phe Gln Leu Leu Glu Gln Glu Ile Thr Lys Pro Val Glu
1 5 10 15

Asn Asp Ile Ser Lys Trp Lys Pro Ser Gln Ser Leu Pro Thr Thr Asn
20 25 30

Asn Ser Val Gly Ala Gln Asp Arg Gln Leu Leu Cys Phe Tyr Tyr Asp
35 40 45

Gln Cys Glu Thr His Phe Ile Ser Leu Leu Asn Ala Ile Asp Ala Leu
50 55 60

Phe Ser Cys Val Ser Ser Ala Gln Pro Pro Arg Ile Phe Val
65 70 75
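
The sequence listings use three-letter residue codes, while the figures use one-letter codes. A minimal sketch of the conversion follows, assuming the standard codes (Ile and Gln in particular) and using the 78 residues of SEQ ID NO: 4; the names `THREE_TO_ONE` and `to_one_letter` are illustrative and do not come from the filing.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter):
    """Convert whitespace-separated three-letter codes to a one-letter string.
    Raises KeyError on any token that is not a standard residue code."""
    return "".join(THREE_TO_ONE[code] for code in three_letter.split())


# SEQ ID NO: 4 as given in the listing (78 residues).
SEQ_ID_4 = """
Leu Ser Gln Phe Gln Leu Leu Glu Gln Glu Ile Thr Lys Pro Val Glu
Asn Asp Ile Ser Lys Trp Lys Pro Ser Gln Ser Leu Pro Thr Thr Asn
Asn Ser Val Gly Ala Gln Asp Arg Gln Leu Leu Cys Phe Tyr Tyr Asp
Gln Cys Glu Thr His Phe Ile Ser Leu Leu Asn Ala Ile Asp Ala Leu
Phe Ser Cys Val Ser Ser Ala Gln Pro Pro Arg Ile Phe Val
"""

one_letter = to_one_letter(SEQ_ID_4)
print(len(one_letter), one_letter)
```

Because the lookup raises on unknown tokens, the same function doubles as a check for transcription errors in a listing.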
WHAT IS CLAIMED IS:

1. An isolated nucleic acid molecule that
includes an open reading frame encoding a mammalian
signal mediator protein between about 795 and about 875
amino acids in length, said protein comprising an
amino-terminal SH3 domain, an internal domain that includes a
multiplicity of SH2 binding motifs, and a
carboxy-terminal effector domain, said effector domain, when
produced in Saccharomyces cerevisiae, being capable of
inducing pseudohyphal budding in said Saccharomyces
cerevisiae under pre-determined culture conditions.

2. The nucleic acid molecule of claim 1, which
is DNA.

3. The DNA molecule of claim 2, which is a
cDNA comprising a sequence approximately 3.7 kilobase
pairs in length that encodes said signal mediator
protein.

4. The DNA molecule of claim 2, which is a
gene, the exons of which comprise said open reading frame
encoding said signal mediator protein.

5. The nucleic acid molecule of claim 1, which
is RNA.

6. An oligonucleotide between about 10 and
about 100 nucleotides in length, which specifically
hybridizes with a portion of the nucleic acid molecule of
claim 1.

7. The oligonucleotide of claim 6, wherein
said portion includes a translation initiation site of
said signal mediator protein.
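
As an illustration of the kind of oligonucleotide recited in claims 6 and 7, the sketch below derives the antisense strand for a short window spanning the first ATG of the cDNA as printed in Figure 1. It is a sketch under stated assumptions: the 23-nucleotide window (10 upstream, 13 from the ATG onward) is an arbitrary choice made for this example, and the code only builds a reverse complement; whether an oligonucleotide "specifically hybridizes" depends on experimental conditions that a script cannot establish.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (case preserved)."""
    return dna.translate(COMPLEMENT)[::-1]


# 5' end of the cDNA as printed in Figure 1: lowercase leader,
# then the coding region beginning with the ATG start codon.
REGION = "acccccacgctaccgaaATGAAGTATAAGAATCTT"

start = REGION.index("ATG")              # position of the start codon
window = REGION[start - 10:start + 13]   # assumed window: 10 nt upstream + 13 nt
oligo = reverse_complement(window).upper()

print(len(oligo), oligo)
```

The resulting 23-mer falls inside the 10 to 100 nucleotide range of claim 6 and, being complementary to the window, contains `CAT` opposite the start codon.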
8. The nucleic acid molecule of claim 1,
wherein said open reading frame encodes a human signal
mediator protein.

9. The nucleic acid molecule of claim 8,
wherein said open reading frame encodes a human signal
mediator protein having an amino acid sequence
substantially the same as Sequence I.D. No. 2.

10. The nucleic acid molecule of claim 9,
wherein said open reading frame encodes amino acid
Sequence I.D. No. 2.

11. The nucleic acid molecule of claim 10,
which comprises Sequence I.D. No. 1.

12. An isolated protein, which is a product of
expression of part or all of the open reading frame of
claim 1.

13. An isolated nucleic acid molecule having a
sequence selected from the group consisting of:

a) Sequence I.D. No. 1;

b) a sequence hybridizing with part
or all of the complementary strand of Sequence I.D. No. 1
and encoding a polypeptide substantially the same as part
or all of a polypeptide encoded by Sequence I.D. No. 1;
and

c) a sequence encoding part or all
of a polypeptide having amino acid Sequence I.D. No. 2.

14. An isolated nucleic acid molecule having a
sequence that encodes a carboxy-terminal effector domain
of a mammalian signal mediator protein, said domain
having an amino acid sequence greater than 74% similar to
a portion of Sequence I.D. No. 2 comprising amino acids
626-834.
15. The nucleic acid molecule of claim 14,
wherein the amino acid sequence of said carboxy-terminal
effector domain is greater than about 57% identical to a
portion of Sequence I.D. No. 2 comprising amino acids
626-834.

16. The nucleic acid molecule of claim 14,
having a sequence that encodes an amino acid sequence
greater than 65% similar to Sequence I.D. No. 2.
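
Claims 14 to 16 rely on percent identity and percent similarity between amino acid sequences. The sketch below shows the simpler of the two, percent identity over an ungapped pair of equal length. It is an illustration only: the function name is hypothetical, the pair compared is residues 666 to 743 of Sequence I.D. No. 2 against the 78 residues of Sequence I.D. No. 4 (a shorter stretch than the 626-834 region named in the claims), and percent similarity would additionally require a substitution matrix or a definition of conservative replacements that is not given in this section.

```python
def percent_identity(a, b):
    """Percent of positions with identical residues in two pre-aligned,
    ungapped sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 666-743 of SEQ ID NO: 2, in one-letter code.
SEQ2_666_743 = (
    "LSQFQLLEQEITKPVE" "NDISKWKPSQSLPTTN" "SGVSAQDRQLLCFYYD"
    "QCETHFISLLNAIDAL" "FSCVSSAQPPRIFV"
)
# SEQ ID NO: 4 (78 residues), in one-letter code.
SEQ4 = (
    "LSQFQLLEQEITKPVE" "NDISKWKPSQSLPTTN" "NSVGAQDRQLLCFYYD"
    "QCETHFISLLNAIDAL" "FSCVSSAQPPRIFV"
)

identity = percent_identity(SEQ2_666_743, SEQ4)
print(f"{identity:.1f}% identical over {len(SEQ4)} residues")
```

The two stretches differ at three of 78 positions, all within the four residues that follow `PTTN`. Real comparisons over regions of unequal length would first need an alignment step, which this sketch deliberately leaves out.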

17. An isolated mammalian signal mediator
protein having a deduced molecular weight of between
about 100 kDa and about 115 kDa; said protein comprising
an amino-terminal SH3 domain, an internal domain that
includes a multiplicity of SH2 binding motifs, and a
carboxy-terminal effector domain, said effector domain,
when produced in Saccharomyces cerevisiae, being capable
of inducing pseudohyphal budding in said Saccharomyces
cerevisiae under pre-determined culture conditions.

18. The protein of claim 17, of human origin.

19. The protein of claim 18, having an amino
acid sequence substantially the same as Sequence I.D. No.
2.

20. The protein of claim 19, having amino acid
Sequence I.D. No. 2.

21. An antibody immunologically specific for
part or all of the protein of claim 17.

22. A polypeptide produced by expression of an
isolated nucleic acid sequence selected from the group
consisting of:

a) Sequence I.D. No. 1;

b) a sequence hybridizing with part
or all of the complementary strand of Sequence I.D. No. 1
and encoding a polypeptide substantially the same as part
or all of a polypeptide encoded by Sequence I.D. No. 1;
and

c) a sequence encoding part or all
of a polypeptide having Sequence I.D. No. 2.

23. An antibody immunologically specific for
part or all of the polypeptide of claim 22.

24. An isolated mammalian signal mediator
protein, which comprises a carboxy-terminal effector
domain having an amino acid sequence greater than 74%
similar to a portion of Sequence I.D. No. 2 comprising
amino acids 626-834.

25. The protein of claim 24, wherein the amino
acid sequence of said carboxy-terminal effector domain is
greater than about 57% identical to a portion of Sequence
I.D. No. 2 comprising amino acids 626-834.

26. The protein of claim 24, having an amino
acid sequence greater than 65% similar to Sequence I.D.
No. 2.

27. An antibody immunologically specific for
part or all of the protein of claim 24.

acccccacgctaccgaaATGAAGTATAAGAATCTTATGGCAAGGGCCTTATATGACAAT
MKYKNLMARALYDN
GTCCCAGAGTGTGCCGAGGAACTGGCCTTTCGCAAGGGAGACATCCTGACCGTCATAGAG 
VPECAEELAFRKGDILTVIE
CAGAACACAGGGGGACTGGAAGGATGGTGGCTGTGCTCGTTACACGGTCGGGAAGGCATT 
QNTGGLEGWWLCSLHGRQGI
GTCCCAGGCAACCGGGTGAAGCTTCTGATTGGCCCCATGCAGGAGACTGCCTCCAGTCAC 
VPGNRVKLLIGPMQETASSH 
GAGCAGCCTGCCTCTGGACTGATGCAGCAGACCTTTGGCCAACAGAAGCTCTATCAAGTG 
EQPASGLMQQTFGQQKLYQV 
CCAAACCCACAGGCTGCTCCCCGAGACACTATCTACCAAGTGCCACCTTCCTACCAAAAT 
PNPQAAPRDTIYQVPPSYQN 
CAGGGAATTTACCAAGTCCCCACTGGCCACGGCACCCAAGAACAAGAGGTATATCAGGTG 
QGIYQVPTGHGTQEQEVYQV 
CCACCATCAGTGCAGAGAAGCATTGGGGGAACCAGTGGGCCCCACGTGGGTAAAAAGGTG 
PPSVQRSIGGTSGPHVGKKV 
ATAACCCCCGTGAGGACAGGCCATGGCTACGTATACGAGTACCCATCCAGATACCAAAAG
ITPVRTGHGYVYEYPSRYQK
GATGTCTATGATATCCCTCCTTCTCATACCACTCAAGGGGTATACGACATCCCTCCCTCA 
DVYDIPPSHTTQGVYDIPPS
TCAGCAAAAGGCCCTGTGTTTTCAGTTCCAGTGGGAGAGATAAAACCTCAAGGGGTGTAT 
SAKGPVFSVPVGEIKPQGVY
GACATCCCGCCTACAAAAGGGGTATATGCCATTCCGCCCTCTGCTTGCCGGGATGAAGCA 
DIPPTKGVYAIPPSACRDEA 
GGGCTTAGGGAAAAAGACTATGACTTCCCCCCTCCCATGAGACAAGCTGGAAGGCCGGAC 
GLREKDYDFPPPMRQAGRPD
CTCAGACCGGAGGGGGTTTATGACATTCCTCCAACCTGCACCAAGCCAGCAGGGAAGGAC 
LRPEGVYDIPPTCTKPAGKD 
CTTCATGTAAAATACAACTGTGACATTCCAGGAGCTGCAGAACCGGTGGCTCGAAGGCAC 
LHVKYNCDIPGAAEPVARRH

CAGAGCCTGTCCCCGAATCACCCACCCCCGCAACTCGGACAGTCAGTGGGCTCTCAGAAC 

QSLSPNHPPPQLGQSVGSQN

GACGCATATGATGTCCCCCGAGGCGTTCAGTTTCTTGAGCCACCAGCAGAAACCAGTGAG 

DAYDVPRGVQFLEPPAETSE

AAAGCAAACCCCCAGGAAAGGGATGGTGTTTATGATGTCCCTCTGCATAACCCGCCAGAT 

KANPQERDGVYDVPLHNPPD 

GCTAAAGGCTCTCGGGACTTGGTGGATGGGATCAACCGATTGTCTTTCTCCAGTACAGGC 

AKGSRDLVDGINRLSFSSTG 

AGCACCCGGAGTAACATGTCCACGTCTTCCACCTCCTCCAAGGAGTCCTCACTGTCAGCC 

STRSNMSTSSTSSKESSLSA 

TCCCCAGCTCAGGACAAAAGGCTCTTCCTGGATCCAGACACAGCTATTGAGAGACTTCAG 

SPAQDKRLFLDPDTAIERLQ 

CGGCTCCAGCAGGCCCTTGAGATGGGTGTCTCCAGCCTAATGGCACTGGTCACTACCGAC 

RLQQALEMGVSSLMALVTTD 

TGGCGGTGTTACGGATATATGGAAAGACACATCAATGAAATACGCACAGCAGTGGACAAG 

WRCYGYMERHINEIRTAVDK

GTGGAGCTGTTCCTGAAGGAGTACCTCCACTTTGTCAAGGGAGCTGTTGCAAATGCTGCC 

VELFLKEYLHFVKGAVANAA 

TGCCTCCCGGAACTCATCCTCCACAACAAGATGAAGCGGGAGCTGCAACGAGTCGAAGAC 

CLPELILHNKMKRELQRVED 

TCCCACCAGATCCTGAGTCAAACCAGCCATGACTTAAATGAGTGCAGCTGGTCCCTGAAT 

SHQILSQTSHDLNECSWSLN 

ATCTTGGCCATCAACAAGCCCCAGAACAAGTGTGACGATCTGGACCGGTTTGTGATGGTG 

ILAINKPQNKCDDLDRFVMV 

GCAAAGACGGTGCCCGATGACGCCAAGCAGCTCACCACAACCATCAACACCAACGCAGAG 

AKTVPDDAKQLTTTINTNAE 

GCCCTCTTCAGACCCGGCCCTGGCAGCTTGCATCTGAAGAATGGGCCGGAGAGCATCATG 

ALFRPGPGSLHLKNGPESIM 

Figure 1B
AACTCAACGGAGTACCCACACGGTGGCTCCCAGGGACAGCTGCTGCATCCTGGTGACCAC

NSTEYPHGGSQGQLLHPGDH

AAGGCCCAGGCCCACAACAAGGCACTGCCCCCAGGCCTGAGCAAGGAGCAGGCCCCTGAC 

KAQAHNKALPPGLSKEQAPD 

TGTAGCAGCAGTGATGGTTCTGAGAGGAGCTGGATGGATGACTACGATTACGTCCACCTA 

CSSSDGSERSWMDDYDYVHL 

CAGGGTAAGGAGGAGTTTGAGAGGCAACAGAAAGAGCTATTGGAAAAAGAGAATATCATG 

QGKEEFERQQKELLEKENIM

AAACAGAACAAGATGCAGCTGGAACATCATCAGCTGAGCCAGTTCCAGCTGTTGGAACAA 

KQNKMQLEHHQLSQFQLLEQ

GAGATTACAAAGCCCGTGGAGAATGACATCTCGAAGTGGAAGCCCTCTCAGAGCCTACCC 

EITKPVENDISKWKPSQSLP 

ACCACAAACAGTGGCGTGAGTGCTCAGGATCGGCAGTTGCTGTGCTTCTACTATGACCAA 

TTNSGVSAQDRQLLCFYYDQ 

TGTGAGACCCATTTCATTTCCCTTCTCAACGCCATTGACGCACTCTTCAGTTGTGTCAGC 

CETHFISLLNAIDALFSCVS

TCAGCCCAGCCCCCGCGAATCTTCGTGGCACACAGCAAGTTTGTCATCCTCAGTGCACAC 
SAQPPRIFVAHSKFVILSAH 

AAACTGGTGTTCATTGGAGACACGCTGACACGGCAGGTGACTGCCCAGGACATTCGCAAC 

KLVFIGDTLTRQVTAQDIRN 

AAAGTCATGAACTCCAGCAACCAGCTCTGCGAGCAGCTCAAGACTATAGTCATGGCAACC 

KVMNSSNQLCEQLKTIVMAT 

AAGATGGCCGCCCTCCATTACCCCAGCACCACGGCCCTGCAGGAAATGGTGCACCAAGTG 

KMAALHYPSTTALQEMVHQV 

ACAGACCTTTCTAGAAATGCCCAGCTGTTCAAGCGCTCTTTGCTGGAGATGGCAACGTTC 

TDLSRNAQLFKRSLLEMATF 

TGAGAAGAAAAAAAAGAGGAAGGGGACTGCGTTAACGGTTACTAAGGAAAACTGGAAATA 

CTGTCTGGTTTTTGTAAATGTTATCTATTTTTGTAGATAATTTTATATAAAAATGAAATA 
TTTTAACATTTTATGGGTCAGACAACTTTCAGAAATTCAGGGAGCTGGAGAGGGAAATCT 
TTTTTTCCCCCCTGAGTNGTTCTTATGTATACACAGAAGTATCTGAGACATAAACTGTAC
AGAAAACTTGTCCACGTCCTTTTGTATGCCCATGTATTCATGTTTTTGTTTGTAGATGTT 

Figure 1C
TGTCTGATGCATTTCATTAAAAAAAAAACCATGAATTACGAAGCACCTTAGTAAGCACCT
TCTAATGCTGCATTTTTTTTGTTGTTGTTAAAAACATCCAGCTGGTTATAATATTGTTCT 
CCACGTCCTTGTGATGATTCTGAGCCTGGCACTGGGAATCTGGGAAGCATAGTTTATTTG 
CAAGTGTTCACCTTCCAAATCATGAGGCATAGCATGACTTATTCTTGTTTTGAAAACTCT 
TTTCAAAACTGACCATCTTAAACACATGATGGCCAAGTGCCACAAAGCCCTCTTGCGGAG 
ACATTTACGAATATATATGTGGATCCAAGTCTCGATAGTTAGGCGTTGGAGGGAAGAGAG 
ACCAGAGAGTTTAGAGGCCAGGACCACAGTTAGGATTGGGTTGTTTCAATACTGAGAGAC 
AGCTACAATAAAAGGAGAGCAATTGCCTCCCTGGGGCTGTTCAATCTTCTGCATTTGTGA 
GTGGTTCAGTCATGAGGTTTTCCAAAAGATGTTTTTAGAGTTGTAAAAACCATATTTGCA 
GCAAAGATTTACAAAGGCGTATCAGACTATGATTGTTCACCAAAATAGGGGAATGGTTTG 
ATCCGCCAGTTGCAAGTAGAGGCCTTTCTGACTCTTAATATTCACTTTGGTGCTACTACC 
CCCATTACCTGAGGAACTGGCCAGGTCCTTGATCATGGAACTATAGAGCTACCAGACATA 
TCCTGCTCTCTAAGGGAATTTATTGCTATCTTGCACCTTCTTTAAAACTCAAAAAACATA 
TGCAGACCTGACACTCAAGAGTGGCTAGCTACACAGAGTCCATCTAATTTTTGCAACTTC 
CCCCCCCGAATTC 



Figure 1D
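
Figure 1 pairs each row of cDNA with the one-letter translation of its codons. The pairing can be checked with a short translation routine. The sketch below assumes the standard genetic code and simply starts at the first ATG it finds; the function name is hypothetical, and the input is the first row of the figure as printed.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of input.
    Codons containing non-ACGT symbols are reported as 'X'."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*":  # stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# First row of Figure 1 as printed (lowercase leader, then coding sequence).
FIRST_ROW = "acccccacgctaccgaaATGAAGTATAAGAATCTTATGGCAAGGGCCTTATATGACAAT"

print(translate_from_first_atg(FIRST_ROW))
```

Applied to the concatenated rows of the figure, the same routine gives a quick way to compare the printed translation with the cDNA, codon by codon.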



HEF1     LSQFQLLEQEITKPVENDISKWKPSQSL.PTTNSGVSAQDRQLLCFYYDQCETHFISL
MEF1     LSQFQLLEQEITKPVENDISKWKPSQSL.PTTNNSVGAQDRQLLCFYYDQCETHFISL
p130cas  LKQFERLE2EVSRPIDHpLANlfrPA^PLVPGR

HEF1     LNAIDALFSCVSSAQPPRIFV
MEF1     LNAIDALFSCVSSAQPPRIFV
p130cas  TrjrtVDA FFTAVAT NQ P PK I FV

Figure 3



INTERNATIONAL SEARCH REPORT

International application No.
PCT/US96/10823

A. CLASSIFICATION OF SUBJECT MATTER
IPC(6): C12Q 1/68; C12P 19/34; C07H 21/02, 21/04; A61K 39/395; C07K 14/00
US CL: 435/6, 91.2; 536/23.1, 24.3; 424/138.1, 139.1, 141.1; 530/350
According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
U.S.: 435/6, 91.2; 536/23.1, 24.3; 424/138.1, 139.1, 141.1; 530/350

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)
Please See Extra Sheet.

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category: Y (claims 1-9, 12, 13, 17-19, 21-23); A (claims 10, 11, 14-16, 20, 24-27)
Citation: The EMBO Journal, Volume 13, Number 16, issued 15
August 1994, R. Sakai et al., "A novel signaling molecule,
p130, forms stable complexes in vivo with v-Crk and v-Src in
a tyrosine phosphorylation-dependent manner," pages 3748-
3756, see entire article.

Citation: E. MCCONKEY et al., "HUMAN GENETICS, THE MOLECULAR
REVOLUTION", published 1993 by Jones and Bartlett
Publishers, Inc. (Boston, MA), pages 38-63, see entire
document.
Relevant to claim No.: 1-9, 12, 13, 17-19, 21-23

[ ] Further documents are listed in the continuation of Box C. [ ] See patent family annex.

Special categories of cited documents:

"A" document defining the general state of the art which is not considered
to be of particular relevance

"E" earlier document published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is
cited to establish the publication date of another citation or other
special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other
means

"P" document published prior to the international filing date but later than
the priority date claimed

"T" later document published after the international filing date or priority
date and not in conflict with the application but cited to understand the
principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be
considered novel or cannot be considered to involve an inventive step
when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be
considered to involve an inventive step when the document is
combined with one or more other such documents, such combination
being obvious to a person skilled in the art

"&" document member of the same patent family

Date of the actual completion of the international search
02 AUGUST 1996

Date of mailing of the international search report
21 AUG 1996

Name and mailing address of the ISA/US
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Authorized officer
DIANNE REES
Telephone No. (703) 308-0196

Form PCT/ISA/210 (second sheet)(July 1992)



INTERNATIONAL SEARCH REPORT

International application No.
PCT/US96/10823

B. FIELDS SEARCHED

Electronic data bases consulted (Name of data base and where practicable terms used):

APS, BIOSIS, CANCERLIT, CAPLUS, CJACS, IFIPAT, MEDLINE, PROMT, SCISEARCH, JAPIO, JICST-EPLUS,
LIFESCI, EMBASE, TOXLINE, TOXLIT, USPATFULL, WPIDS

search terms: SH-2, SH-3, tyrosine kinases, p130, Cas, Crk-associated substrate, pseudohyphal budding.

Form PCT/ISA/210 (extra sheet)(July 1992)